GENOTYPING BY SIMULTANEOUS ANALYSIS
OF MULTIPLE MICROSATELLITE LOCI



The work leading to this invention was supported in part by Grant No. GM 47145 from
the National Institutes of Health. The United States Government may retain certain rights in this
invention.

BACKGROUND OF THE INVENTION 
Field of the Invention 

This invention is directed to semi-automated methods for linkage mapping of the genome
by genotyping of multiple microsatellite loci. 
Summary of Background Information 

For most genetic disorders, there is no known biochemical defect. Consequently, the
mutant genes associated with the disease and their disease-causing abnormal gene products are 
recognized solely by the anomalous phenotype they produce. Identifying the chromosomal 
localization for the gene(s) that produce these disease phenotypes is often the first crucial step 
toward isolation and characterization of the mutation(s) by recombinant DNA techniques. 
The significance of mapping a gene is perhaps better appreciated when put into context
with the human genome project. Consider for a moment that even after every base of the DNA
in the entire human genome has been sequenced through the Human Genome Initiative (HGI),
and every gene has been located in this sequence, it may still not be clear which disorder(s)
arise from which gene(s). Each disease phenotype will still need to be "mapped" or associated
with a particular location in the genome. This is usually carried out by analyzing DNA
from blood specimens collected from individuals within families affected by a genetic disorder.

Once a disorder or abnormal phenotype has been linked to a particular region on a chromosome,
the limited number of genes within this area will permit us to suggest a candidate gene that may
contribute to the phenotype. Thus, once the localization of a major disease phenotype to a
chromosomal region is confirmed, a few candidate genes can be examined for mutations
as potential pathogenic mechanisms.

If no genes have been mapped to the region, linkage studies with closely-spaced
surrounding markers can often be used to delineate a large chromosomal interval (1-2 Mb) in
which to search for transcribed sequences. This approach [...]
is generally referred to as "positional cloning". In the past, the isolation of candidate genes
in these large genomic regions was the rate-limiting step in positional cloning, [...]
of intensive work. However, recent improvements in methods to capture expressed sequences
encoded within large genomic segments have been described. Thus, there is now a need for
advances in molecular genetic methods employed in the linkage mapping of disease genes.
Linkage 

The chromosomes are the basic units of inheritance on which genes and DNA markers
are organized in a linear fashion (see Figure 1). Linkage is evident when a gene(s) that
[...] advanced the construction of high resolution linkage maps (Weber and May, 1989,
Am. J. Hum. Genet., 44:388-396; Litt and Luty, 1989, Am. J. Hum. Genet., 44:397-401)
[...] (Beckmann, et al., 1992). Where SSR markers [...] repeats between individuals, [...]
approach the ideal for linkage studies.

Most SSR are (GT)n dinucleotide repeats [...] there are about 100,000 of [...] of the
(GT)n type [...] SSR markers have been [...]
It is now well accepted that methods based on the polymerase chain reaction (PCR) and
highly polymorphic simple sequence repeat (SSR) markers (e.g. Figure 3) are the techniques of
choice for genotyping in linkage studies (Weber, et al., 1989; Litt, et al., 1989; Edwards, et al.,
1991, Am. J. Hum. Genet., 42:746-56). PCR-based methods are faster and therefore less costly
than restriction fragment length polymorphism (RFLP) methods; moreover, they do not require
nucleic acid probes, and are more informative in linkage studies. Efforts are underway to
develop automated techniques for genotyping that will further improve the efficiency of linkage
studies utilizing this type of microsatellite marker polymorphism. The advantages of analyzing
multiple polymorphic loci using an automated DNA sequencer were first described by Skolnick
and Wallace in 1988 (Genomics, 2:273-279). Building on techniques reported by Connell, et
al. (1987, Biotechniques, 5:342-348), Ziegle, et al. (1992, Genomics, 14:1026-1031) extended
this approach to incorporate automated DNA sizing technology for genotyping microsatellite loci
using four color fluorescence-based techniques.

However, the analysis of microsatellite markers still relies on gel electrophoresis, which
has limited sample handling capacity. Furthermore, the gel electrophoresis of DNA fragments
is complicated by problems with gel distortion, such as band shifting, that warrant internal size
standards and bandmatching software (Lander, 1991, Am. J. Hum. Genet., 48:819-823).
Crosstalk or interference during analysis between multiple dyes with spectral overlap is another
potential problem when multiple PCR fragments of the same size are to be identified within the
same gel lane. Since the processing of gels and the scoring of autoradiographs remains the
rate-limiting step in genotyping, methods are being sought that improve the efficiency of sample
handling while minimizing errors in data transcription and analysis.
The challenge of mapping the major genes in complex disorders requires efficient and
highly accurate methods of genotyping. Recent technological enhancements in molecular
genetics have significantly improved our ability to locate disease genes by linkage analysis.
However, despite the introduction of molecular methods, such as PCR, and the discovery of
highly polymorphic SSR, genotyping is still rate-limiting for localizing disease genes by linkage.
The present methods remain highly technical, time-consuming, and expensive.
SUMMARY OF THE INVENTION 

It is an object of this invention to provide a robust semi-automated protocol for 
genotyping using multiplex analysis of many microsatellite loci while maintaining, or improving, 
typing accuracy as compared to traditional methods. It is also an object of this invention 
to provide a collection of highly reproducible microsatellite markers at approximately 10-50 cM 
intervals throughout the human genome which can be detectably-labelled. 

It is a further object to provide protocols for the reliable use of these marker systems in 
automated genotyping.

To meet these and other objects, and to better exploit the inherent advantages of
fluorescence-based genotyping techniques, this invention provides highly informative SSR
markers, assembled into "SETS" that do not overlap in size when separated electrophoretically
on an acrylamide gel and that can be labelled with different fluorophores. Each SET contains
6 or more pairs of primers that provide for amplification of markers (preferably 7-8 pairs of
primers) that have been labelled with the same fluorophore having a distinct color, separate
SETS having different fluorophore labels (e.g., blue, green, or yellow). PCR products
corresponding to these SETS are combined into a GROUP for electrophoretic analysis in a single
lane. Using this methodology, a GROUP of 18 or more, preferably 21 to 24 dinucleotide
markers can be electrophoresed along with an internal size standard and analyzed simultaneously
(multiplexing) in real-time for each individual studied.

In particular, the invention provides a kit for use in automated genotyping within a
population comprising four or more GROUPS, each GROUP containing at least three SETS, and
each SET in turn comprising at least 6 labelled pairs of primers for amplification of DNA by
polymerase chain reaction (PCR), the sequence of each primer pair corresponding to a portion
of the unique genomic sequence of a microsatellite sequence (which is made up of a nucleotide
repeat sequence flanked by unique sequences), the nucleotide repeat sequence being polymorphic
within the population. Amplification of DNA from a human sample by the polymerase chain
reaction (PCR) primed with a particular primer pair amplifies the nucleotide repeat sequence and
at least some of the immediately adjacent unique sequences of the microsatellite sequence to
produce a PCR product identified with the primer pair. The distance in the genome between the
microsatellite sequence amplified by one primer pair of the kit and the nearest other
microsatellite sequence amplified by another primer pair of the kit is at least 2 centimorgans
(cM) and no more than 50 cM. Each SET consists of at least 6 of the primer pairs, where the
length of the segment amplified by a particular primer pair (its PCR product) differs from the
length of PCR products from all other primer pairs in the SET by at least 5 nucleotides for
tetranucleotide repeats, at least 6 nucleotides for trinucleotide repeats and at least 9 nucleotides
for dinucleotide repeats. At least one primer of each primer pair is labelled with a fluorescent
label that is the same for all primer pairs in the SET. Each GROUP consists of at least three
SETS of primer pairs labelled with fluorescent labels, and primers from one SET in the GROUP
are labelled with a fluorescent label which fluoresces at a wavelength which is substantially
different from the wavelength at which the fluorescent labels on the primers in each of the other
SETS in the GROUP fluoresce.
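
By way of illustration only, the composition rules stated above for a GROUP can be sketched in Python. The names PrimerPair and validate_group and the marker names are hypothetical and are not part of the kit; the thresholds (at least three SETS per GROUP, at least 6 primer pairs per SET, one label shared within a SET, a different label for each SET) are taken from the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerPair:
    locus: str  # marker name (hypothetical, for illustration)
    dye: str    # fluorescent label carried by the labelled primer


def validate_group(group, min_sets=3, min_pairs=6):
    """Return a list of rule violations for one GROUP (a list of SETS)."""
    problems = []
    if len(group) < min_sets:
        problems.append(f"GROUP has {len(group)} SETS, needs at least {min_sets}")
    dye_owner = {}  # dye -> index of the first SET that uses it
    for i, primer_set in enumerate(group):
        if len(primer_set) < min_pairs:
            problems.append(
                f"SET {i} has {len(primer_set)} primer pairs, needs at least {min_pairs}"
            )
        dyes = {pair.dye for pair in primer_set}
        if len(dyes) > 1:
            problems.append(f"SET {i} mixes labels: {sorted(dyes)}")
        for dye in dyes:
            if dye in dye_owner and dye_owner[dye] != i:
                problems.append(
                    f"label {dye} is shared by SET {dye_owner[dye]} and SET {i}"
                )
            dye_owner.setdefault(dye, i)
    return problems


# Three SETS of six primer pairs, one label per SET: no violations.
good = [
    [PrimerPair(f"M{s}_{k}", dye) for k in range(6)]
    for s, dye in enumerate(["FAM", "JOE", "TMR"])
]
# Third SET replaced by one with five pairs that reuses the FAM label.
bad = good[:2] + [[PrimerPair(f"X{k}", "FAM") for k in range(5)]]
```

Calling validate_group(good) returns an empty list, while validate_group(bad) reports the short SET and the reused label.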

Where the primers in a single kit cover the entire genome with markers spaced
approximately 10 cM apart in the genome, the kit will usually contain at least about 10
GROUPS. In another embodiment, a kit is provided for screening of the genome with individual
markers spaced in the genome about 50 cM from the nearest other marker in the kit, and the kit
contains at least 4 GROUPS. The invention also provides kits containing fewer GROUPS with
primers whose PCR products identify microsatellite sequences found in the genome spaced
closely about the locations picked out by screening studies performed using the screening kit.
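
The arithmetic behind these GROUP counts can be sketched as follows. This is only a rough worked example: the genetic length of 3,300 cM is an assumed round figure that is not stated in this document, and the function name groups_needed is hypothetical. The markers-per-GROUP values of 18 and 24 come from the GROUP sizes described above.

```python
import math


def groups_needed(genome_cM, spacing_cM, markers_per_group):
    """Return (markers, groups) needed to tile a genome at a given spacing."""
    markers = math.ceil(genome_cM / spacing_cM)
    groups = math.ceil(markers / markers_per_group)
    return markers, groups


ASSUMED_GENOME_CM = 3300  # assumption for illustration, not from this document

dense = groups_needed(ASSUMED_GENOME_CM, 10, 24)   # full-coverage kit
screen = groups_needed(ASSUMED_GENOME_CM, 50, 18)  # screening kit
```

Under these assumptions a 10 cM kit needs 330 markers (14 GROUPS of 24) and a 50 cM screening kit needs 66 markers (4 GROUPS of 18), which is in line with the minimum GROUP counts given above.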


The invention also provides a method of analyzing genomic DNA for the presence of
polymorphisms comprising: extracting DNA from a human sample; combining, in a polymerase
chain reaction (PCR) vessel, an aliquot of the extracted DNA, at least one primer pair selected
from one of the GROUPS described above, and PCR amplification enzymes; cycling the
temperature of each PCR vessel to produce PCR products that can be identified with the primer
pair whose sequence corresponds to unique sequence in the amplified DNA, using an annealing
temperature at which non-specific annealing is minimized; then combining all PCR products
from all PCR vessels containing primer pairs from a single GROUP into a mixture, and
subsequently separating the mixture of PCR products electrophoretically by size; and detecting
separated PCR products by fluorescence detection at wavelengths corresponding to the
fluorescent wavelength for each of the fluorescent labels in the kit. In a preferred embodiment,
one primer of each primer pair is labelled with a fluorescent label and the other primer in the
pair is labelled with biotin, and a mixture containing all PCR products corresponding to the
primer pairs from a single GROUP is prepared by binding the PCR products to a plurality of
paramagnetic beads carrying on their surface a protein which specifically binds biotin (the beads
being added to each PCR vessel after amplification), separating the magnetic beads from the
PCR reaction medium, then separating the two strands of the amplified DNA segments and
combining the strands labelled with a fluorescent label for all primer pairs from one GROUP
into the mixture.

The invention also provides a method for selecting a SET of PCR primers for use in
automated genotyping comprising selecting at least 6 microsatellite sequences, which contain
dinucleotide, trinucleotide or tetranucleotide repeat sequences that are flanked by unique sequences
in the human genome, and are polymorphic within the population, the microsatellite sequences
being separated from each other by at least 2 centimorgans in the genome, and for each
microsatellite sequence constructing primer pairs having the sequence of the unique sequences
flanking the microsatellite sequences, so that the primer pairs will direct PCR amplification of
DNA segments corresponding to each microsatellite sequence and the length of all polymorphs
of the microsatellite sequence amplified by a particular primer pair is detectably different from
the length of all polymorphs of other microsatellite sequences amplified by other primer pairs
in the SET. The invention also provides a kit for use in automated genotyping comprising at
least 10 GROUPS of at least 3 SETS of PCR primers obtained by this method, and a method
of analyzing genomic DNA for the presence of polymorphisms comprising amplifying DNA
extracted from a human sample using PCR directed by these primer pairs to produce PCR
products labelled with detectable labels that are the same for all PCR products from a single
SET, followed by separating electrophoretically a mixture containing all PCR products amplified
from the DNA sample by any primer pair of said SET and characterizing the detectably labelled
PCR products by length.

The invention also provides a diagnostic method for detection by polymerase chain 
reaction of genomic rearrangement (including deletions, additions, crossovers and gene 
amplification), of a genomic region containing at least 6 known loci at which genetic 
rearrangement is diagnostic for a disease, using a kit comprising at least one SET containing at 
least 6 PCR primer pairs, the sequences of each primer pair corresponding to the unique 
sequences flanking one of the loci of genomic rearrangement. The primer pairs in the SET are 
constructed so that the PCR product amplified by a particular pair of primers corresponds to a 
DNA segment surrounding one locus of rearrangement with length that is characteristic of a 
specific rearrangement, and the length of the PCR products amplified by a particular pair of 
primers differs from the length of all other PCR products amplified by other primers in the SET. 
DNA from a sample is amplified in a PCR vessel using the polymerase chain reaction (PCR) 
primed with at least one of the primer pairs of the SET by cycling the temperature of the vessels 
with an annealing temperature that minimizes non-specific annealing to produce detectably 
labelled PCR products, and the PCR products for all primer pairs in the SET are detectably 
labelled with the same label. Labelled PCR products are separated electrophoretically by size 
from a mixture containing all PCR products amplified from the DNA sample by any primer pair 
of the SET, and the separated, detectably labelled PCR products are characterized by length. 
In a preferred mode, all primers in the SET have annealing temperatures within a 4°C range, and
amplification for all primers in the SET is carried out simultaneously in the same vessel. 

The inventor has created a kit comprising SETS of highly polymorphic fluorescent
primers specific for microsatellite markers that cover the genome at approximately 10 cM
intervals for linkage studies. A fluorescence-based protocol based on these SETS has been
developed for detection of multiple microsatellite markers, and the protocol is accurate as
compared to a conventional radiolabeling method that depends on a known DNA sequence ladder
and conventional autoradiography for detection. It has now been demonstrated that genotyping
by semi-automated fluorescence-based techniques is both highly accurate and efficient. We
routinely type 24 fluorescent markers simultaneously using these techniques in my laboratory.
The combined analysis of 24 dinucleotide markers in a single gel maximizes the use of
automated analysis equipment, such as the Applied Biosystems 373A hardware, by producing
PCR products sufficiently small to run the instrument at least twice daily. The methods
provided herein may improve productivity by more than an order of magnitude and can be easily
adapted to most linkage studies.
BRIEF DESCRIPTION OF THE FIGURES

Figure 1 shows the genetic map of the chromosomal region surrounding a putative
GENETIC locus. In this example the greater the spacing between markers the more likely
recombination will occur during meiosis.

Figure 2 shows segregation data from a fabricated three generation family affected with
a genetic disorder for the four markers illustrated in Figure 1. Squares indicate males, circles
indicate females. Affected and unaffected family members are indicated by solid and open
symbols, respectively. Crossovers that have occurred during meiosis are indicated by the
arrowheads. Recombination with markers 1 and 4 from chromosome A excludes a localization
for the gene causing this disorder in the region immediately above marker 1 and below marker
4. The region from chromosome A between markers 1 and 4 (including markers 2 and 3) co-
segregates with the abnormal phenotype in all the affected individuals in this family but is not
found in any unaffected individuals. These data confirm a localization for the GENETIC locus
under study to this chromosomal region.

Chromosomal region 4 of chromosome B from affected individual I-1 occurs in both
affected and unaffected offspring in generation II, showing no linkage. The markers used in this
demonstration approach the ideal by providing maximal genetic information for every individual 
studied. 

Figure 3 illustrates the most common form of simple sequence repeat. In this individual 
the marker is heterozygous, or differs in the number of dinucleotides between the maternal and 
paternal chromosomes. These PCR products would differ in length by 8 nucleotides, and are
each easily detected using gel electrophoresis. The solid bars indicate surrounding sequence that
is unique (occurs only once in the human genome) and can be used to design PCR primers for
amplifying this simple sequence repeat.

Figure 4 shows a cartoon of GROUP 1 markers. Each simple sequence repeat marker 
is identified on the left, and the size range for known alleles are noted on the right. Each 
marker covers a region of a chromosome to be examined for linkage with a genetic disorder. 
The colored boxes refer to the region on the gel where alleles for each marker may be found. 
The markers are chosen to avoid overlap between these regions. For increased efficiency each
SET is labelled with one of three fluorophores - yellow: tetramethyl-6-carboxy-rhodamine
(TMR), blue: 5-carboxy-fluorescein (FAM), and green: 2',7'-dimethoxy-4',5'-dichloro-6-
carboxy-fluorescein (JOE); (red: 6-carboxy-X-rhodamine (ROX) is reserved for internal size
standards), Applied Biosystems. The products of the PCR amplifications are pooled and
subjected to the electrophoresis together. Marker data are derived from the Genome Data Base
(GDB), The Johns Hopkins University, Baltimore, Maryland.
Figure 5 shows a typical set of electrophoretograms for GROUP 2 using DNA from a
single individual.

Figure 6 shows an electrophoretogram of SET A, GROUP 1 markers from one
individual. The size (nucleotides) of each PCR product is given on the X-axis above the
electrophoretogram.

Figure 7 A-M provides a listing of the markers in 13 GROUPS each containing 16-24
markers divided into three SETS. The first column gives a locus designation for the marker to
identify the entry in the Genbank Data Base which provides the unique sequences surrounding
the markers. The unique sequence information can be used to design primers that will direct
PCR amplification of the marker. After the locus designation, the size range of the published
alleles (in base pairs), the degree of heterozygosity in the population and the chromosomal
location are listed, in that order, for each marker followed by the nucleotide sequences of
preferred primer pairs, along with their annealing temperatures and preferred choice for labelled
primer.

Figure 8 demonstrates the difference in autoradiographic image produced depending on
whether the forward or reverse primer is labelled.

Figure 9 shows an autoradiograph of PCR-amplified DNA using the primers of GROUP
2, SET B. The variation in intensity in products of this SET is typical of this type of marker.

Figure 10 shows the effect of varying the amount of paramagnetic beads in a magnetic
bead-based recovery from PCR.
DETAILED DESCRIPTION OF THE INVENTION

Methods for sequencing DNA, for synthesizing oligodeoxynucleotides of defined
sequence, and for separating nucleic acid segments by molecular weight using, e.g.,
electrophoresis are well known to those skilled in the art and well described in the literature, in,
for example, "Molecular Cloning: A Laboratory Manual," Sambrook, et al., eds., Cold Spring
Harbor Laboratory Press, 1989. General methods of analyzing DNA by the polymerase chain
reaction (PCR) including isolation and preparation of DNA templates, synthesis and labelling
of primers, amplification, and analysis of PCR products are also well known and described in
the literature, for example in Sambrook, et al., 1989, or in "PCR Protocols: A Guide to
Methods and Applications," Innis, et al., eds., Academic Press, 1990. The skilled worker in
this art is familiar with these and other methods of manipulating and analyzing DNA, and
routine application of such methods within the skill of the ordinary skilled worker is assumed
in the following description.
Semi-Automated Genotyping: 

Despite the improvements in linkage techniques introduced by PCR and SSRs, genotyping 
remains highly technical, time consuming, and expensive. The application of fluorescence-based 
technology is one way to further reduce the cost and increase the efficiency of this type of
project. Fluorescent labeling of PCR-based markers provides many potential advantages over
radio-labels (e.g., 32P) and other labels in common use for PCR markers. Fluorescent labels are
nontoxic, stable, and can be combined and analyzed together in a single electrophoretic lane
(multiplexing) to provide a many-fold increase in efficiency over standard methods of detection.
Fluorescence signals are linear over a much greater range of intensity than conventional
autoradiography and other methods of detection in use, providing a better means of
distinguishing between alleles and artifact. Band intensity provides an objective method for
distinguishing between alleles and artifacts and may also provide a better means for identifying
the products of microsatellite markers that frequently vary significantly in intensity.

Ultimately, real-time fluorescence detection methods may provide a substantial increase
in efficiency over standard methods of detection based on radiolabeling. A much larger range
of product sizes can be resolved on each gel run as compared to radiolabeling techniques because
with the automated, real-time equipment such as that of Applied Biosystems, Inc., the PCR products
pass by the detector toward the bottom of the gel where the band resolution is greatest.
Efficiency is further improved by the potential real-time semi-automated detection of alleles.
In addition, internal size standards are easily incorporated for reproducibility and the accurate
sizing of alleles, avoiding day to day variability. Computerized data acquisition and handling
further aid productivity and reduce errors in data entry and manipulation. Ultimately,
automation is likely to occur more rapidly with fluorescence-based techniques than with other
methods of labeling and detection.

As an initial test of the fluorescence technology, a study was conducted comparing the
accuracy and reliability of these methods with 32P end-labeling (see Example 1). Three markers
were chosen because they produce PCR products of the same size range. Products of PCR
reactions run with primers complementary to the unique sequences on either side of the SSR for
these markers were obtained using primer pairs in which one primer of each pair was conjugated
to a fluorescent label. These PCR products were electrophoresed simultaneously in a single
electrophoretic lane to test if these genotypes could be accurately determined. Similar to the
report by Ziegle, et al., 1992, there was no difficulty in discerning PCR fragments of the same
size labelled with different fluorophores.
Determining the size of DNA fragments accurately is critical to genotyping in a number
of applications. When parental alleles are available, a simple comparison can determine which,
if either, parental allele has been passed on to a child. However, frequently in linkage studies
the parental alleles are not available for comparison, and paternity must be questioned. This is
also true in DNA forensics, where an unknown must be compared with many others and its size
determined unambiguously. The analysis of PCR products that differ grossly in concentration
is complicated by bandshifting and other gel related artifacts. The accuracy of this typing
procedure must be based on empiric studies of reproducibility using "known" samples as
standards. Non-polymorphic internal size standards can be used to remedy these problems
(Lander, 1991).

Example 1 demonstrates the accuracy of sizing microsatellite PCR products using a
fluorescence-based approach as compared to a conventional radiation-based method using a
known sequence ladder. DNA templates may be obtained from the collection of Centre d'Etude
du Polymorphisme Humain, Paris (CEPH) for use as a standard set of alleles to compare these
techniques, because there is little question of the genetic identity of each of the individuals in
this collection. To avoid ambiguity in genotyping with the fluorescent method, fractional size
estimates should preferably be accurate to within 0.5 nucleotides. Variation greater than this
could lead to confusion during band matching, after rounding up or down for size estimates
provided as a fraction of a nucleotide. Since our analysis suggests that the maximum variation
is likely to be less than 0.5 nucleotides (and generally significantly less), the method will be
useful in the intended applications.
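
The role of the 0.5 nucleotide tolerance in band matching can be illustrated with a minimal Python sketch. The function name call_allele and the example allele sizes are hypothetical; the sketch simply assigns a fractional size estimate to the nearest known allele length and declines to call it when the deviation reaches the tolerance.

```python
def call_allele(size_estimate, known_alleles, tolerance=0.5):
    """Match a fractional size estimate (nucleotides) to a known allele length.

    Returns the nearest known allele when the estimate lies within
    `tolerance` nucleotides of it, otherwise None (ambiguous call).
    """
    nearest = min(known_alleles, key=lambda allele: abs(allele - size_estimate))
    if abs(nearest - size_estimate) < tolerance:
        return nearest
    return None


# Hypothetical dinucleotide marker with alleles two nucleotides apart.
alleles = [148, 150, 152, 154]
```

With these values, an estimate of 152.3 is assigned to the 152 allele, whereas an estimate of 151.0 lies a full nucleotide from both neighbours and is left uncalled.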

As shown in Example 1, no sizing errors occurred with the use of the multi-color
fluorescence-based technique, showing that this methodology is highly accurate and reproducible
for scoring microsatellite markers. Since the only sizing error resulted from the use of the
conventional radiolabeling technique, the fluorescence-based protocol appears at least as accurate
as the conventional method. Therefore, this approach appears to adequately compensate for gel
distortion and dye related artifacts as compared to radiation labeling techniques.

Accordingly, the advantages demonstrated for fluorescence-based techniques may be
exploited by the method of this invention, which uses at least 6 highly informative SSR markers
assembled into a ladder which we have designated a "SET". Each SSR marker is characterized
by PCR primer pairs which have the same sequence as a portion of the unique DNA sequence
on the 5' side of the sense and antisense strands, respectively, encoding the repeat sequence at
a particular point in the genome. When the genetic material of a particular individual is
amplified by PCR using one of these primer pairs, a segment of DNA corresponding to the
sequence of the particular SSR and its unique flanking sequences is produced (the PCR product).
The size of the PCR product is dependent both on how much of the unique sequences are
covered by the primers in the pair and on the number of times the repeat sequence is repeated.
The number of repeats of the simple sequence at a particular locus varies between individuals
(polymorphism), and this polymorphism results in PCR products of varying size for different
individuals. Thus the size of the PCR product can be used to determine if two individuals have
an allele in common at the genetic locus of the SSR marker.
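
The dependence of product size on flanking sequence and repeat number can be written out as a short Python sketch. The function names and the flank lengths of 60 and 52 nucleotides are hypothetical values chosen for illustration; each flank length is meant to cover the primer plus any unique sequence between the primer and the repeat.

```python
def product_length(flank_5, flank_3, repeat_unit, n_repeats):
    """PCR product length: both unique flanks plus the repeat tract."""
    return flank_5 + flank_3 + repeat_unit * n_repeats


def share_allele(genotype_a, genotype_b):
    """True if two individuals have at least one product size in common."""
    return bool(set(genotype_a) & set(genotype_b))


# A dinucleotide marker with 18 repeats on one chromosome and 22 on the other.
short_allele = product_length(60, 52, 2, 18)
long_allele = product_length(60, 52, 2, 22)
```

Here the two products are 148 and 156 nucleotides long, a difference of 8 nucleotides for a difference of four dinucleotide repeats. An individual typed (148, 156) shares an allele with one typed (156, 160) but not with one typed (150, 152).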

The spacing in the gel between PCR products identified with different markers is critical.
By carefully selecting the length of the primer sequences for each marker, the PCR products
corresponding to each marker in a SET are spaced a critical distance from surrounding markers
such that none of the PCR products for the largest known alleles of one marker overlap in size
with PCR products for the shortest known alleles of another marker in the SET when separated
on a 6% denaturing acrylamide gel. An additional safety margin should be provided, because
rare undocumented alleles (larger or smaller) may occur for any given marker. Size spacing of
less than 9 nucleotides between dinucleotide SSR markers increases the likelihood for overlap
because 2-4 stuttering bands (each 2 nucleotides apart) below the smallest allele of one marker
may overlap with the largest allele of the marker below it. PCR products for trinucleotide
repeat sequences and tetranucleotide repeat sequences are not observed to exhibit stuttering
bands, so the minimum separation distance above and below the largest and smallest known
alleles can be less for tri- and tetranucleotide repeats. Usually, PCR products for trinucleotide
repeats in a SET will differ by at least 5 base pairs, and for tetranucleotide markers by at least
6 base pairs. Preferably a SET will contain 7-9 SSR markers, most preferably 8-9 markers.
The upper limit on the number of markers in a SET is dependent on the length of the
electrophoretic separation.
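
A check of this spacing rule for a candidate SET might be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the names Marker and spacing_conflicts and the example size ranges are hypothetical, the default minimum gaps (9, 5 and 6 nucleotides for di-, tri- and tetranucleotide repeats) are those of the paragraph above and are passed as a parameter so that other values can be substituted, and where two markers have different repeat types the larger of the two gaps is applied as a conservative choice.

```python
from itertools import combinations
from typing import NamedTuple


class Marker(NamedTuple):
    name: str
    smallest: int     # smallest known allele, in nucleotides
    largest: int      # largest known allele, in nucleotides
    repeat_unit: int  # 2 = di-, 3 = tri-, 4 = tetranucleotide repeat


def spacing_conflicts(markers, min_gap=None):
    """List marker pairs in a SET whose size ranges are too close together."""
    if min_gap is None:
        min_gap = {2: 9, 3: 5, 4: 6}  # defaults from the paragraph above
    ordered = sorted(markers, key=lambda m: m.smallest)
    conflicts = []
    # Compare every pair, so nested or overlapping ranges are also caught.
    for lower, upper in combinations(ordered, 2):
        required = max(min_gap[lower.repeat_unit], min_gap[upper.repeat_unit])
        gap = upper.smallest - lower.largest
        if gap < required:
            conflicts.append((lower.name, upper.name, gap, required))
    return conflicts


# Hypothetical dinucleotide markers: B and C are only 5 nucleotides apart.
candidate_set = [
    Marker("A", 100, 120, 2),
    Marker("B", 129, 150, 2),
    Marker("C", 155, 170, 2),
]
```

For this candidate SET the function reports a single conflict between markers B and C, whose ranges are 5 nucleotides apart where 9 are required. Any safety margin for undocumented alleles can be added by widening the size range entered for each marker.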

The PCR product of each primer pair in the SET is tagged with the same label,
preferably a fluorescent dye. Usually a fluorescent label is covalently attached to one of the
primers in a primer pair. Alternatively, the PCR product may be uniformly labelled by adding
one or more fluorescently-labelled nucleoside triphosphates to the PCR reaction. Labelling of
the primers may be accomplished by including a fluorescently-labelled nucleotide during
synthesis of the primer or by linking a fluorescent label to the primer after synthesis.
Fluorophore labels for attachment to nucleic acids, including PCR primers, are readily available
in the art. (See, e.g., Nagaoka, et al., (1992) Chem. Pharm. Bull., 40:2559-2561; Giusti, et
al., (1993) PCR Methods Appl., 2:223-227; Alexandrova, et al., Nucleic Acids Symp. Ser. 1991,
p. 277; Schubert, et al., (1992) DNA Seq., 2:273-279; Vu, et al., (1990) Tetrahedron Lett.,
21:7269-7272.) Usually the labels contain coupling groups that react with modified nucleotides
of the PCR primers to form covalent links. Attaching such fluorophores to the primers in the
SETS of this invention is easily within the skill of the ordinary worker. See, e.g., Levenson
and Chang, 1990, "Nonisotopically Labelled Probes and Primers," in PCR Protocols, Innis, et
al., eds., Academic Press, NY. Fluorescent labels with non-overlapping emission spectra are
also available commercially, for example, from Applied Biosystems, Inc., including 5-carboxy-
fluorescein (FAM-blue), 2',7'-dimethoxy-4',5'-dichloro-6-carboxy-fluorescein (JOE-green),
N,N,N',N'-tetramethyl-6-carboxy-rhodamine (TMR-yellow), and 6-carboxy-X-rhodamine (ROX-
red); from Biological Detection Systems, Inc., Pittsburgh, PA (BDS) including nucleoside
triphosphates coupled to cyanine dyes that fluoresce in the green or orange region, or Boehringer
Mannheim Corporation Biochemical Products, Indianapolis, IN, including fluorescein-5(6)-
carboxamidocaproxyl-dUTP (yellow), 7-hydroxy-coumarin-3-carboxyl-dUTP (blue), and
tetramethylrhodamine-5(6)-amino-thiono-dUTP (red).

Additional suggestions for selecting labels with non-overlapping fluorescent spectra and
derivatizing oligonucleotides with them can be found in Smith, et al., 1986, Nature, 321:674-
679, incorporated herein by reference. Alternatively, primers (or PCR products) may be
labelled with biotin (see, e.g., Innis, et al., "PCR Protocols," Academic Press, NY, 1990, pp.
100-103) and then streptavidin coupled to a particular fluorescent dye added to all of the PCR
products of a particular SET. Variations of these labelling methods or similar methods known
to those skilled in the art may be used, so long as all PCR products for markers in one SET are
labelled with the same label.

SETS, each labelled with a different fluorophore, can be pooled into a collection of
markers that we have termed a "GROUP." The number of SETS in a GROUP will depend on
the availability of distinct labels. PCR products for each SET in the GROUP will usually be
labelled with fluorophores that emit light at a wavelength substantially different from the
wavelengths emitted by fluorophore labels of the other SETS in the GROUP, where
"substantially different" means sufficiently distinct to be distinguished by the detection means
chosen for detecting PCR products after electrophoresis. For example, three commercially
available fluorophores, referred to as TMR, FAM, and JOE (Applied Biosystems), have
different colors which are yellow, blue, and green, respectively.

Using this approach we have analyzed as many as 24 SSR markers in a single
electrophoretic lane using three distinct fluorescent labels to label three SETS in the GROUP
(see, e.g., Fig. 4). In a preferred mode, these fluorescent PCR products may be separated on an
automated electrophoresis system, such as the Applied Biosystems 373 sequencer with internal
size standards in each lane (labelled, for example, with ROX (red dye), Applied Biosystems),
analyzed using, e.g., GeneScan 672 software (Applied Biosystems) (Ziegle, et al., 1991, Miami
Short Rep., 1:70), and scored using GENOTYPER software (Applied Biosystems), with data
displayed as an electrophoretogram or in a spreadsheet format. Gel band fluorescent intensities
and peak areas provide an objective method of distinguishing alleles from artifact (stuttering
bands). A typical electrophoretogram from a single individual for SET A GROUP 1 is
illustrated in Figure 6.

Marker Selection and Development: 

The human genome is estimated to be approximately 3000 cM in length. Therefore, to
adequately "cover" the entire genome at 10 cM intervals will require approximately 300 highly
informative, well-spaced markers. An alternative estimate obtained by summing the meiotic
maps from all the chromosomes suggests that the genome is approximately 5000 cM in length
(NIH/CEPH Collaborative Mapping Group, 1992, Science, 258:67-86). Adequate "coverage"
of the entire genome based on this size estimate at 15 cM intervals (which would allow testing
for linkage without using a prohibitively large number of families) will require about 333 highly
informative, well-spaced markers.
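
The two coverage figures above follow from dividing the estimated map length by the desired marker spacing. A minimal sketch of that arithmetic is given here; the function name is illustrative only, and perfectly even spacing of markers is assumed.

```python
def markers_for_coverage(genome_length_cm: float, interval_cm: float) -> float:
    """Number of evenly spaced markers needed for one marker per interval."""
    if interval_cm <= 0:
        raise ValueError("interval must be positive")
    return genome_length_cm / interval_cm


# Map lengths and spacings quoted in the text.
print(round(markers_for_coverage(3000, 10)))  # 3000 cM at 10 cM spacing
print(round(markers_for_coverage(5000, 15)))  # 5000 cM at 15 cM spacing
```

In practice markers are not evenly spaced, so these values are best read as lower bounds on the number of markers required.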

Characteristics of preferred markers can be summarized as follows: unique sequence
surrounding the marker is available for use in designing primers, the markers have been sized
accurately, the heterozygosity value is known, and each marker has been carefully localized.
Over 1000 SSR markers, including the surrounding unique sequence and chromosomal location,
have been described to date in the Genome Data Base (GDB), October 19, 1993, The Johns
Hopkins University, Baltimore, Maryland. In contrast to older approaches, such as RFLP, many
of the preferred SSR markers are heterozygous (alleles differ at a particular locus) > 50% of the
time and therefore are highly informative for linkage studies. Each allele of the markers used
in the method of this invention will be easily detectable after amplification by PCR as a
predictable component of a complex image or signature by 5' end labeling with ³²P, labeling
with fluorescence, or by a variety of other methods. Most preferably, the markers also produce
an easily scored product or simple pattern of stutter bands that are the signature of
mononucleotide and dinucleotide repeats.

Most dinucleotide repeats produce two or three smaller, less intense products or "stutter
bands" (Weber, 1989). These are artifacts produced during PCR, and are less common in PCR
of tri- and tetranucleotide repeats. Although these stutter bands have been generally considered
undesirable, they can be quite helpful to the investigator (or computer) during the scoring of
genotypes by allowing for the identification of 'false' bands (background bands due to
non-specific annealing). Each allele can then be easily scored by 5' end labeling with ³²P or
fluorescence after amplification by PCR, as a predictable component of a complex image.
Background bands are generally not associated with stuttering artifacts. Because artifacts due
to nonspecific annealing are difficult to eliminate entirely from a PCR reaction, the adaptation
of a similar protocol for the multiplex semi-automated genotyping of tri- and tetranucleotide
repeats may be more problematic. The method of this invention reduces artifacts due to
non-specific annealing by control of the annealing temperature for respective primers during
temperature cycling.

The use of dinucleotide SSR is preferred in the method of this invention, because the
potential advantages for automated genotyping may not be so easily incorporated into practice
for mono-, tri- and tetranucleotide repeats. PCR products of trinucleotide and tetranucleotide
repeats lack the unique "stuttering" signature of dinucleotide repeats, making it difficult for the
computer to distinguish real alleles from artifacts produced by nonspecific annealing during
PCR. Although a simple set of PCR products is produced as alleles (little or no stuttering)
from tri- or tetranucleotide SSRs, it is often difficult to eliminate other PCR artifacts completely.
These PCR artifacts are not easily distinguished from "false" bands when large numbers of PCR
products that vary significantly in intensity are combined as described by this method. The
unique signature derived from the stuttering bands of dinucleotide repeats provides a simple
means of distinguishing real products (alleles) from artifactual bands.

Furthermore, the cost of the hardware is generally considered the limiting factor when
adopting the fluorescent approach. Tri- and tetranucleotide markers generally require a
significantly larger fraction of each gel because alleles span a much larger size range. Thus
longer run time is required, and fewer markers can be resolved per gel. The cost of the
hardware becomes readily affordable if one considers the utility and throughput of such an
instrument when used according to the method of this invention. However, the use of fewer
markers per lane (i.e., tetranucleotide repeats) would substantially reduce the cost effectiveness
of the hardware by reducing efficiency.

Finally, far fewer tri- and tetranucleotide markers have been fully characterized at
present. Thus, the availability of well-characterized primers which can be assembled into SETS
and GROUPS remains another limiting factor at present.

Construction of Marker SETS:

The selection of markers for inclusion in each SET is based on three considerations: the
need to maximize heterozygosity values (genetic informativeness), the placement of the marker
within a SET based on the size of the PCR products (alleles produced must not overlap with
those of the marker above or below it), and the location of the marker in the genetic map
(ideally we would have 450-500 markers placed 10 cM or less apart). The PCR products
corresponding to markers within a SET are sized to assure that infrequent alleles and stutter
bands do not produce overlap between the markers (compare, e.g., Figures 4 and 6). PCR
products for SETS of dinucleotide markers differ by approximately 9 nucleotides, preferably at
least 10 nucleotides, in length. When necessary, new oligonucleotide primers based on the
unique sequence surrounding a polymorphic marker are designed and synthesized to assure that
the PCR products do not overlap during electrophoresis.
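
The spacing rule for a SET can be expressed as a simple check on allele size ranges. The sketch below is illustrative only: the `Marker` class, the function name, and the example markers and sizes are hypothetical, and the default gap of 10 nucleotides is taken from the preferred spacing for dinucleotide markers stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str
    min_bp: int  # length of the smallest known allele (PCR product)
    max_bp: int  # length of the largest known allele


def spacing_conflicts(markers, min_gap_bp=10):
    """Return (lower, upper, gap) for markers closer together than min_gap_bp."""
    ordered = sorted(markers, key=lambda m: m.min_bp)
    if not ordered:
        return []
    conflicts = []
    highest = ordered[0]  # marker with the largest allele seen so far
    for marker in ordered[1:]:
        gap = marker.min_bp - highest.max_bp
        if gap < min_gap_bp:
            conflicts.append((highest.name, marker.name, gap))
        if marker.max_bp > highest.max_bp:
            highest = marker
    return conflicts


# Hypothetical SET: marker_B and marker_C are only 5 nucleotides apart.
candidate_set = [
    Marker("marker_A", 100, 120),
    Marker("marker_B", 130, 150),
    Marker("marker_C", 155, 175),
]
print(spacing_conflicts(candidate_set))
```

A conflict reported by such a check corresponds to the situation in which the text calls for new primers to be designed so that the product sizes shift apart.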

Figures 7A-M show 289 SSR markers that have been selected and combined into 11
GROUPS of 21-24 markers and 2 incomplete GROUPS of 16 markers so that markers in each
GROUP can be separated and analyzed simultaneously. The selected markers cover the genetic
map on average once every 10 cM. Most are heterozygous greater than 70% of the time. In
a preferred embodiment, each SET is composed of 8 markers from multiple linkage groups (see,
e.g., Figure 7B-H). Most preferably, SETS of markers are part of a single linkage group (i.e.,
a single chromosome), but this may require significant additional labor because fewer existing
primers will be suitable.

Additional or alternative SSR loci to assemble into GROUPS of markers may be found 
in GDB. Loci listed in GDB can be arranged on the genetic map by using map location 
information in GDB. Additional or alternative primers may then be designed using information 
on the surrounding DNA sequence available in Genbank, based on the locus designations from
GDB. GROUP 1 markers (Figure 7A) are currently performing well in multiple laboratories.

In many cases new oligonucleotide primers must be designed from the sequence
surrounding each marker to produce PCR products that fit between the products of the markers
above and below it without overlap. The new primers can readily be designed from the known
sequence surrounding the SSR. Criteria for selecting a sequence to be synthesized as a PCR
primer are well known (see, e.g., Sambrook, et al., and Innis, et al., especially p. 9).
Preferably, the unique primer 3' sequence should contain at least 7 nucleotides, the ΔG
threshold should be at least -1.0 kcal/mol, most preferably -1.4 kcal/mol, and duplex formation
should be avoided, the maximum length of duplex not exceeding 2 base pairs. The sequence
of preferred primers will also minimize or eliminate self-complementarity, hairpin formation,
and false priming. Once the sequences of candidate primers are chosen, synthesis is readily
accomplished by standard methods (see, e.g., Sambrook, et al.).

Optimization of PCR Conditions and Appearance on the Gel:

These new primers must be tested to assure that they produce an easily scored collection
of products of the correct size. Scoring may be easier if the label is on one primer rather than
the other for particular markers (see, e.g., Figure 8). Primers developed for dinucleotide
markers may perform well in the PCR reaction, but produce products unacceptable for
genotyping (single base stuttering bands, stuttering bands of equal intensity with true alleles, or
stuttering bands that are larger than the correct allele), and such primers should be avoided.

For best results, the PCR conditions for each marker should be optimized to eliminate
any artifactual PCR products due to nonspecific annealing that may complicate the analysis of
a GROUP of combined markers. In particular, the temperature of the annealing phase of each
PCR cycle should be optimized for each primer pair. Accordingly, the annealing phase
temperature is set relatively high, so that specific hybridization occurs, but non-specific
hybridization between the template DNA and the primers is minimized. Usually, the selectivity
provided by this optimization is preserved in the method of this invention by limiting the number
of primer pairs in any PCR reaction vessel to those whose optimized annealing temperature is
the same or nearly the same. Preferably, all primer pairs in the same PCR vessel have
annealing temperatures within 4°C of each other. At one extreme, an entire 96 well plate is
dedicated to PCR reactions using primers for a single marker. (When genotyping is performed
for a large number of individuals, using a separate plate for PCR reactions for each marker will
not reduce efficiency.) Alternatively, each PCR vessel on a plate has only one primer pair, but
the plate contains vessels having different primer pairs, so long as all primer pairs on the same
plate have annealing temperatures within 4°C. In a preferred mode, all of the primer pairs for
a SET or even a GROUP are constructed to have optimized annealing temperatures in a narrow
range, most preferably 4°C, and all of the primers are present in a single PCR reaction vessel,
obviating the need to mix the individual PCR products prior to electrophoretic separation.
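
The rule that primer pairs sharing a vessel or plate should have annealing temperatures within 4°C of each other can be sketched as a grouping step. This is an illustration under stated assumptions, not the procedure of the invention: the function name is ours, the greedy grouping strategy is an assumption, and the entry named `hypothetical_marker` is invented for the example. The temperatures for mfd 1, mfd 59 and mfd 154 are those reported in Example 1.

```python
def group_by_annealing_temp(annealing_temps_c, window_c=4.0):
    """Greedily group primer pairs so each group spans at most window_c degrees."""
    groups = []
    current = []
    lowest = None  # lowest annealing temperature in the current group
    for name, temp in sorted(annealing_temps_c.items(), key=lambda kv: kv[1]):
        if current and temp - lowest > window_c:
            groups.append(current)
            current = []
        if not current:
            lowest = temp
        current.append(name)
    if current:
        groups.append(current)
    return groups


temps = {
    "mfd 1": 60.0,
    "mfd 59": 58.0,
    "mfd 154": 58.0,
    "hypothetical_marker": 65.0,  # invented value, for illustration only
}
print(group_by_annealing_temp(temps))
```

Sorting by temperature and opening a new group only when the window is exceeded gives the smallest number of groups for a fixed window, which corresponds to the fewest separate vessels or plates.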

In addition, each marker should be evaluated to assure it is sized correctly within the SET 
and that the alleles can be easily scored as distinct products. Furthermore, reported 
heterozygosity values are usually verified using a population of unrelated individuals. The same
DNA templates provided herein may be used as controls for verification of protocols and quality
assurance. Preferred controls include CEPH parents (BIOS Corporation, New Haven, Conn.;
Cell Repository, Camden, NJ), such as families 1331, 1347, 884, for which reference alleles
are known (see, Weber, et al., and Genethon Microsatellite Map Catalog, Genethon Human
Genome Research Center, Evry, France). Pooled DNA from volunteers who have donated
blood that has been purified as described in the EXAMPLES may be used as well.

This optimization process requires the synthesis of oligonucleotide primers, dilution and
aliquoting of primers, identification of the appropriate annealing temperature (Ta) and PCR
protocol, electrophoresis of the products, autoradiography and data analysis. If labelled primers
are used for detection of products, 5' end labeling of both primers should be tested to determine
which one produces the best image. (Frequently the image produced by labeling one of the pair
of primers is blurred; see, e.g., Figure 8.) The size of the PCR products from each marker should
be verified experimentally to assure that it does not overlap with the products of the surrounding
markers in the same SET. As a control for this purpose, PCR products from a pool of DNA
samples from a population of unrelated individuals may be electrophoresed against a DNA
sequence ladder. In a preferred mode the test pool will contain at least 50 chromosomes.

Initial characterization of primers for each SSR marker may be performed with ³²P labels
because this is less costly, but the smooth adaptation of fluorescent-based techniques for
genotyping with markers that have been optimized using ³²P is also dependent on assuring that the
PCR products labelled with a fluorescent dye perform as expected during PCR and analysis.
Therefore, the reliability of the developed protocol should be checked by electrophoresis of
DNA samples labelled by PCR with the fluorescent labels.

The PCR products of different microsatellite markers frequently vary significantly in
intensity (see, e.g., Figure 9). The sizing of fluorescent PCR products of grossly different
concentrations is potentially complicated by sample overloading, causing spectral interference
between the dye labels during analysis. There was no interference in the detection of the
overlapping products using the four dyes in Examples 1 or 5, because the concentration of each
PCR product was determined and adjusted to prevent overloading. However, in our experience
this can become a problem when working routinely with 21 to 24 pooled markers.

Overloading can lead to artifacts that become especially troublesome when they are 
interpreted as internal size standards. To prevent the inaccurate sizing of the products by the 
GeneScan 672 software, we have found that the selection of the standard peaks must be carried 
out manually. During large scale applications, such as in our linkage studies, this may become 
a serious problem. Moreover, it is often impractical to estimate the concentration of each of the 
fluorescent products in order to adjust the concentration of the individual samples to be pooled. 
Generally adjustments in the volumes for each marker can be made for all the samples by 
estimating the relative intensity of the marker within a SET. This is easily accomplished by 
referring to the data table of fluorescent band intensities or by viewing the electrophoretogram 
directly. 
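
One way to turn relative band intensities into pooling volumes is sketched below. The text does not specify a formula, so the inverse-proportional rule, the function name, and the example intensities are all assumptions made for illustration; the 5 microliter reference volume echoes the volume pooled per marker in Example 1.

```python
def pooling_volumes(peak_intensity, max_volume_ul=5.0):
    """Scale pooling volumes inversely to intensity; the weakest marker gets max_volume_ul."""
    if not peak_intensity:
        return {}
    if min(peak_intensity.values()) <= 0:
        raise ValueError("intensities must be positive")
    weakest = min(peak_intensity.values())
    return {
        marker: round(max_volume_ul * weakest / intensity, 2)
        for marker, intensity in peak_intensity.items()
    }


# Hypothetical relative intensities read from the data table of band intensities.
intensities = {"marker_A": 400, "marker_B": 1600, "marker_C": 800}
print(pooling_volumes(intensities))
```

Under this assumed rule, a marker four times as intense as the weakest one contributes a quarter of the volume, so the pooled signals are roughly equalized.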

In a preferred mode, PCR products are recovered and combined into a mixture containing
the GROUP by a simple protocol that uses magnetic separation technology to purify the
fluorescent PCR products and which restricts the total amount of product pooled to prevent
overloading. Magnetic separation provides simple separations based on specific binding
interactions without the need for expensive centrifuges. Saturation binding to a limited amount
of paramagnetic beads can be used to control the amount of labelled PCR product carried
forward in the analysis. Relative intensity may be adjusted by this means and overloading may
be avoided.

In a preferred embodiment, one primer is labelled with a component that will bind to
magnetic microbeads; for example, biotin-labelled primers will bind to streptavidin-coated
magnetic beads. Methods for labelling primers with biotin are taught in, e.g., Innis, et al.,
"PCR Protocols," 1990, pp. 100-103 and references cited therein. Magnetic beads coated with
streptavidin are commercially available (Dynabeads™) and procedures for separation are
described in, e.g., "Magnetic Separation Techniques Applied to Cellular and Molecular
Biology," Kemshead, et al., eds., Wordsmiths' Conference Publications, Somerset, U.K., 1991.
A fixed amount of magnetic beads is added to the PCR reaction after amplification using
primers that will bind to the magnetic beads. The magnetic beads with the PCR product
attached are separated from the remainder of the PCR reaction mixture, including salts and
unused, detectably-labelled primer, and then the PCR product is recovered from the magnetic
beads (for example, by separating the strands, leaving one strand attached to the bead and
recovering the other strand whose primer carries the detectable label).

Alternatively, the entire PCR product may be labelled by including biotinylated UTP in
the PCR reaction medium as described by Dennis, et al., 1990, in "PCR Protocols," Innis, et
al., eds. The PCR product can be bound to the beads for purification from the PCR reaction
mix and excess primer, and subsequently recovered from the beads by, for example, denaturation
of streptavidin. In another alternative mode, paramagnetic beads which have attached to their
surfaces single stranded DNA corresponding to a part of the sequence of the PCR product may
be added to the PCR reaction mix at the end of amplification, followed by cycling above the
melting temperature, reannealing and then separating the paramagnetic beads and any other DNA
strands annealed to the beads from the reaction mix. Labelled strands can then be recovered
from the beads, as above.

Selection of SETS and GROUPS of fluorescent SSR markers covering the human genome
(approximately 300) can be completed in approximately 6-9 months, using the procedures
provided herein. Preferably, additional fluorescent markers will be developed (approximately
500 SSR markers) providing a higher resolution tool for gene mapping. The resolution of this
marker collection will approach 10 cM and will preferably cover the telomeres, which will better
assure linkage detection in complex non-Mendelian disorders like asthma and diabetes.

The development of a common index set of fluorescent markers that can be used in
multiple laboratories simultaneously should provide certain advantages in genomic studies.
Typing these common index loci in a number of different populations afflicted with the same
disorder will facilitate the comparison of linkage results and provide the information required
for the eventual application of these techniques to forensic medicine.

The method of this invention offers several significant advantages over a similar strategy
adopted by Diehl et al., 1991, Am. J. Hum. Genet., 47:177. Spacing markers in a SET
according to this invention avoids overlap, providing improved discrimination among markers
and between markers and artifacts. As many as eight or more markers may be incorporated into
a SET. When necessary, new oligonucleotide primers based on the unique sequence surrounding
a polymorphic marker can be designed and synthesized as taught herein to assure that the PCR
products do not overlap during electrophoresis. Errors introduced by sample handling may also
be minimized by storing DNA from each individual to be studied in a 96-well format. Our
protocol preserves the integrity of a 96-well format including PCR amplifications, product
pooling, and sample purification, thereby minimizing sample handling and errors introduced by
excessive sample manipulations. In a preferred mode, efficiency is further aided by the transfer
of a row of samples by multichannel pipette.

The combined analysis of multiple markers maximizes the use of the Applied Biosystems 
373 sequencer or similar automated analysis hardware. Since the capacity of the 373 sequencer 
is 36 lanes per gel, 864 genotypes (1728 alleles) can be analyzed routinely from one gel using 
the semi-automated method of this invention. A typical linkage study would include about 100 
families or about 500 individuals. For a 5-year study including about 300 markers, 
approximately 180 gels, or about 3 gels per month, will be required. By using the method of 
this invention, at least 2 gels per day can be run per 373 sequencer. Thus, up to 12 
investigators can be accommodated on one instrument, which substantially reduces the cost per 
investigator. 
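
The throughput figures in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only numbers stated in the text (36 lanes per gel, 24 markers per lane, about 500 individuals, about 300 markers, a 5-year study); the variable and function names are ours.

```python
import math

LANES_PER_GEL = 36      # capacity of the 373 sequencer, per the text
MARKERS_PER_LANE = 24   # three SETS of eight markers in one GROUP

genotypes_per_gel = LANES_PER_GEL * MARKERS_PER_LANE
alleles_per_gel = 2 * genotypes_per_gel  # two alleles per genotype


def gels_required(individuals: int, markers: int) -> int:
    """Whole gels needed to type every individual at every marker once."""
    return math.ceil(individuals * markers / genotypes_per_gel)


gels = gels_required(individuals=500, markers=300)
gels_per_month = gels / (5 * 12)  # spread over a 5-year study

print(genotypes_per_gel, alleles_per_gel, gels, round(gels_per_month, 1))
```

The computed gel count is slightly below the figure of approximately 180 quoted above; the calculation assumes every lane is fully used and no sample is repeated.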

The method of this invention can also increase the efficiency of diagnostic studies of the
genome, when the desired diagnostic procedures involve the detection of genetic changes that
affect the length of genomic DNA at 6 or more locations. Such changes include additions,
deletions, intra- and interchromosomal crossover, gene amplification and similar gene
rearrangements. The loci of many such rearrangements are known and associated with many
diseases, especially cancers and metabolic errors inherited recessively. PCR using primer pairs
which direct amplification of a DNA segment including one of these loci can be used
diagnostically where the rearrangement associated with the disease causes a change in the length
of the PCR product. A SET of primers designed according to the principles above can be used
in the production of PCR products that can be analyzed electrophoretically in a single lane, for
more efficient use of electrophoresis and analysis equipment.






EXAMPLES

The following examples describe particular embodiments within the broader invention.
These embodiments are described for illustrative purposes only, without intention to limit the
invention.

EXAMPLE 1 

As an initial test of the fluorescence technology, a study was conducted to compare the
accuracy and efficiency of these methods with a conventional radiation-based method. Three
microsatellite loci producing PCR products that overlap in size were chosen to compare the
accuracy of genotyping by fluorescence versus radiolabeling. Discrepancies between the
genotypes derived from each technique were resolved by repetition. To estimate the variation
in sizing of the fluorescence-based technique, certain samples were loaded on 3 or more gels for
comparison. DNA from CEPH (Centre d'Etude du Polymorphisme Humain, Paris) families
884, 1331, 1332, 1333, 1362 was amplified for Marshfield markers mfd 1 (176-196 bp), mfd
59 (175-195 bp), and mfd 154 (186-204 bp) using the polymerase chain reaction (PCR).

Fluorescent techniques: The forward and reverse primers were each labelled at the 5'
end for detection by autoradiography with [γ-³²P]ATP (6000 Ci/mmol) using polynucleotide
kinase. A primer was selected from each marker for fluorescent labeling on the basis of the
image of the products (see Figure 8). The optimal annealing temperature was selected for each
marker empirically by selecting a temperature that eliminated nonspecific annealing or artifactual
(background) PCR products. Fluorescent labels were attached at the 5' end via phosphoramidate
derivatization using Aminolink 2 (Applied Biosystems). Primer B (see Figure 10) for mfd 1 was
labelled yellow (TMR), primer A (see Figure 10) for mfd 59 was labelled blue (FAM), and
primer B (see Figure 10) for mfd 154 was labelled green (JOE). PCR conditions were: 0.4
primers, 1.5 µM MgCl2, 50 ^ KCl, 200 dNTPs and 0.5 units Taq polymerase (final
concentrations); 94°C for 10 min; followed immediately by 30 cycles of 94°C for 30 sec; 58°C
(mfd 59, mfd 154) for 30 sec or 60°C (mfd 1) for 30 sec; and 72°C for 30 sec; followed by
72°C for 7 min. PCR was carried out in a volume of 12.5 µl using 25 ng of CEPH DNA.
CEPH DNA was stored in a 96 well microtiter plate (Perkin Elmer/Cetus). Amplifications were
performed in 96 well microtiter plates using a Perkin Elmer/Cetus Model 9600 thermalcycler
and accessories, maintaining the integrity of the 96 well template. Five microliters were
combined from each marker for each CEPH individual using a multichannel pipette
(Transferpette-8, Brinkman). The pooled PCR products were desalted by adding 2 volumes of
sterile deionized distilled water (ddH2O) and ice cold ethanol (100%) equal to the total volume, and
chilling for 30 minutes at -70°C. The microtiter plate was spun at 4°C at 1400 x g for 2 hours
in a Beckman Model GS6R centrifuge. The supernatant was aspirated, the pellet was washed
once with 1.5 volumes of ice cold ethanol (70%), and the plate centrifuged 30 minutes at
1400 x g at 4°C. The supernatant was aspirated and the plate was air dried. Pellets were
resuspended in a volume of sterile ddH2O equal to the starting volume (pool).

Radiolabelled products were separated by conventional electrophoresis and scored
manually from autoradiographs. Fluorescent PCR products were separated on a 373 sequencer
with internal size standards in each lane (GeneScan 2500-ROX; Applied Biosystems) and analyzed
using GeneScan™ 672 software (Applied Biosystems). Each sample (representing 0.5 µl of each
product) was heated to 99°C after adding 1 µl of the internal lane size standards (GeneScan
2500-ROX, Applied Biosystems) and 2 µl formamide/EDTA loading buffer, until the total
volume was reduced to 2-3 µl. Electrophoresis was carried out using 6% acrylamide (Biorad),
8 M urea (Ultrapure, USB) gels in 1X TBE. The reduced volume was loaded and run for 4-8
hours on a model 373 Sequencer (Applied Biosystems) using a 24 cm well to read distance.

The size of the PCR product is determined by reference to the internal lane size standards
(Carrano et al. 1989, Genomics, 4:129-136). The size standard ROX-2500 (Applied
Biosystems) including fragments 37, 94, 109, 116, 172, 186, 222, 233, 238, 269, 286, 361,
and 479 nucleotides in length was used with modifications. PCR fragments 61 and 68
nucleotides in length were gel purified, labelled by aminolinking with ROX, and added in equal
volumes to the ROX-2500 standards. These fragments were added because desalting by ethanol
precipitation recovers the unused PCR primers with the products. The intense peak produced
by the unincorporated labelled primer is seen in the standards because of interference between
dyes and obscures the detection of the 37 nucleotide standard fragment. Therefore, we have
modified the GeneScan-2500 standards to provide a fragment of known size labelled with ROX
to accurately estimate the length of the smallest alleles.
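
Sizing against internal lane standards amounts to fitting a calibration curve through the standard peaks and reading allele sizes off that curve. The sketch below fits a second-order least-squares curve, the curve type this document attributes to the GeneScan 672 software; it is an independent illustration in plain Python, not that software's implementation. The function name and the scan positions are hypothetical, while the fragment lengths are a subset of the ROX-2500 sizes listed above.

```python
def fit_size_curve(scans, sizes):
    """Fit size = a + b*x + c*x^2 by least squares and return a predictor.

    x is the scan position, centred and scaled for numerical stability.
    """
    if len(scans) != len(sizes) or len(set(scans)) < 3:
        raise ValueError("need at least three distinct standard peaks")
    mean = sum(scans) / len(scans)
    span = max(scans) - min(scans)
    xs = [(s - mean) / span for s in scans]

    # Normal equations for the quadratic fit, as a 3x4 augmented matrix.
    power_sums = [sum(x ** k for x in xs) for k in range(5)]
    rhs = [sum(y * x ** k for x, y in zip(xs, sizes)) for k in range(3)]
    m = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))

    def predict(scan):
        x = (scan - mean) / span
        return a + b * x + c * x * x

    return predict


# Fragment lengths (nucleotides) taken from the ROX-2500 list in the text.
standard_sizes = [94, 109, 116, 172, 186, 222, 233, 238, 269, 286, 361]
# Hypothetical scan positions for those peaks; real values come from the gel.
standard_scans = [1010, 1130, 1185, 1620, 1725, 1990, 2070, 2105, 2325, 2445, 2960]

size_of = fit_size_curve(standard_scans, standard_sizes)
print(round(size_of(1800), 1))  # estimated length of a hypothetical allele peak
```

Because the curve is only reliable between the smallest and largest standards used, adding short standards such as the 61 and 68 nucleotide fragments extends the range over which small alleles can be sized.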

The GeneScan 672 (version 1.0) software recognizes any peak labelled with ROX,
computes a calibration curve based on a second-order least-squares fit, and uses these data to
estimate the allele sizes of the PCR products (Ziegle et al. 1992). Data from each lane can be
analyzed independently, or four lanes of data for a single fluorescent dye can be displayed
simultaneously to compare individuals within a family. Allele sizes in nucleotide bases, the
genotypes, are assigned by interactively distinguishing major peaks from background artifacts.
The scale on the display can be adjusted to analyze peaks with differences in fluorescent
intensity. The intensity of each fluorescent band and peak areas provide an objective method
of distinguishing alleles from artifact (including stuttering bands). Allele sizes can be transferred
to a spreadsheet database for linkage or a multicolor electrophoretogram.

mfd 1, mfd 59, and mfd 154 PCR products overlap in size (175-204 bp) (see Figure 10).
There was no evidence of interference between the dyes even when there was complete overlap
during the electrophoresis of PCR products, similar to that reported by Ziegle et al., 1992. In
our experience, interference between dyes does become a problem with overloaded samples.
A comparison of the genotyping results of the radioactive and fluorescent labeling methods
revealed 4 discrepancies out of 462 possible comparisons (alleles) (see Table 1). One
transcription error occurred in the manual data manipulation of the fluorescently labelled
products. There was no interference between fluorophores with the detection of the overlapping
products using the four dyes. No sizing errors were attributed to the fluorescence-based
technique and each marker displayed Mendelian inheritance. The average size variation across
all comparisons was 0.28 nucleotides. However, the maximum difference (range) found for any
of the 462 comparisons was 0.47 nucleotides (see Table 2). Generally sizing varied less within
a gel than between gels. The variation in the size of the alleles was similar when comparing
each of the individual markers. The remaining discrepancies occurred with the use of the
standard radioactive-based protocol and represented an error rate of less than 1%. Inaccurately
sized PCR products and sample misloadings produced mistypings with the conventional
technique (see Table 1). In general, fluorescent internal size standards provided more precise
sizing than did radiolabeling. These data demonstrate both improved accuracy and efficiency
for typing SSR markers with use of fluorescence-based techniques.

TABLE 1



CEPH DNA/Marker     Genotype           Genotype          Explanation
                    (Radiolabelled)    (Fluorescence)

884-18/mfd 1        178,192            178,194*          size estimate error
1331-16/mfd 59      179,179            179,185*          gel loading error
1331-17/mfd 59      179,170            179,185*          gel loading error
61332-15/mfd 154    185,200*           200,200           recording error

* indicates correct score by length in nucleotide residues 



TABLE 2 

COMPARISON     RANGE (in nucleotides)
               Maximum    Average    Standard Deviation

intergel       0.47       0.28       0.08
intragel       0.42       0.18       0.07
mfd 1          0.35       0.19       0.1
mfd 59         0.37       0.15       0.08
mfd 154        0.42       0.23       0.06

Superscripts indicate number of samples 
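
The discrepancy counts reported above can be turned into error rates with simple arithmetic. The following Python sketch is illustrative only: it assumes that the 4 discrepancies in Table 1 split into 3 attributable to the radioactive protocol and 1 recording error in the fluorescent data, as the text describes, and the function name is hypothetical.

```python
def error_rate_percent(errors: int, comparisons: int) -> float:
    """Return errors as a percentage of comparisons."""
    if comparisons <= 0:
        raise ValueError("comparisons must be positive")
    return 100.0 * errors / comparisons


TOTAL_COMPARISONS = 462   # allele comparisons reported in the text
RADIOACTIVE_ERRORS = 3    # assumption: Table 1 rows where the fluorescent call is starred
FLUORESCENT_ERRORS = 1    # assumption: the single recording (transcription) error

radio = error_rate_percent(RADIOACTIVE_ERRORS, TOTAL_COMPARISONS)
fluor = error_rate_percent(FLUORESCENT_ERRORS, TOTAL_COMPARISONS)
print(f"radioactive: {radio:.2f}%  fluorescent: {fluor:.2f}%")
```

Under these assumptions both rates fall below 1%, which agrees with the figure quoted for the radioactive-based protocol.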
EXAMPLE 2 

Mapping with Fluorescent Primers 

Genomic DNA is isolated as described by M.J. Johns, et al., Analytical Biochem., 
180:276-278 (1989). 

To minimize sample handling, DNA templates can be stored in a 96-well grid (e.g., 
Perkin Elmer/Cetus). The integrity of the grid may be maintained throughout the protocol to 
avoid errors introduced by manual pipetting and sample handling. Multichannel pipetting from 
a 96-well grid expedites sample handling while minimizing human errors. 

PCR is performed in a reaction volume of 12.5 µl, containing 50 mM dATP, dGTP, 
dTTP, dCTP; 0.07 µM of the labelled oligonucleotide primer, and 4 of the unlabelled 
primer. Taq polymerase (Perkin-Elmer\Cetus), 0.5 units, is added on ice. PCR will usually be 
performed in a thermalcycler, e.g., a Perkin-Elmer\Cetus 9600 thermalcycler. Standard 
thermalcycler settings are 94°C for 10 minutes, followed by 30 cycles of 94°C for 30 seconds, 
30 seconds at the average annealing temperature for the primers, and 72°C for 30 seconds; final 
extension is at 72°C for 7 minutes. 
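
The cycling profile just described can be written out as data, which makes the programmed hold time easy to check. This is a minimal sketch, not instrument software: the `Step` class, the function names and the 58°C annealing value are illustrative assumptions, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float
    seconds: int


def build_program(anneal_c: float, cycles: int = 30) -> list[Step]:
    """Expand the cycling profile into a flat list of hold steps."""
    program = [Step("initial denaturation", 94.0, 10 * 60)]
    for _ in range(cycles):
        program.append(Step("denature", 94.0, 30))
        program.append(Step("anneal", anneal_c, 30))
        program.append(Step("extend", 72.0, 30))
    program.append(Step("final extension", 72.0, 7 * 60))
    return program


def hold_time_minutes(program: list[Step]) -> float:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return sum(step.seconds for step in program) / 60


program = build_program(anneal_c=58.0)  # 58 is an illustrative annealing temperature
print(len(program), hold_time_minutes(program))
```

With 30 cycles the programmed holds add up to 62 minutes, so the actual run time on a given instrument will be somewhat longer once ramping is included.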

Labelled PCR products are purified by co-precipitation in EtOH. 24 markers may be 
co-precipitated simultaneously in the 96-well format using ethanol. Ethanol precipitation desalts 
the products but copurifies the primers. The labelled primer peak produces an enormous signal 
that complicates the analysis of products under 93 nucleotides in length because it interferes with 
the 37 nucleotide ROX GeneScan-2500 standard. As an alternative, internal standards may 
incorporate fragments that are 50, 60, and/or 70 nucleotides in length in addition to the 
GeneScan 2500 standard fragments or an equivalent set of fragments. 

The amplified products are analyzed by denaturing gel electrophoresis (Sambrook, et al.). 
Loading buffer (2X concentration) is added to an equal volume of the PCR reaction, and the 
PCR reaction is loaded on a 6% polyacrylamide gel. Radioactive products will be sized against 
a sequence ladder; the gels are dried and then exposed to Kodak XAR film for 4-24 hours with 
or without intensifying screens. Fluorescently labelled PCR products may alternatively be 
analyzed by semi-automated detection using, e.g., an ABI 373A automated sequencer and 
GeneScan 672 software from Applied Biosystems, Inc. 

EXAMPLE 3 

PCR products are produced as in Example 2 and then purified and combined for 
electrophoresis using a magnetic bead protocol in place of EtOH precipitation. One of each pair 
of primers is labelled with biotin and the other with a fluorescent label as above. 
Double-stranded PCR products are purified using streptavidin conjugated to paramagnetic beads to bind 
the primer 5' labelled with biotin. This procedure may be easily adapted to the 96-well format 
in any laboratory without expensive centrifuges. After the DNA bound to magnetic beads is 
separated from the PCR reaction media, the two strands are melted and separated, and the strand 
labelled with the fluorescent primer is pooled with other labelled strands of its GROUP for 
electrophoresis. The result of increasing the amount of beads used for separation of a single 
PCR product from its PCR reaction mix is shown in Figure 12. 

EXAMPLE 4 

32P OPTIMIZATION OF PRIMER SETS 
DNA Templates 

CEPH parents and/or unrelated volunteers as controls may be tested. In addition, we 
usually include one "no DNA" control and one reference individual (alleles known) on each 
plate. To maximize the use of resources, each marker may be optimized using 12 wells or less 
of a 96-well plate. Eight markers are amplified per plate at a single temperature. Alternatively, 
a thermalcycler with a smaller sample capacity may be used. 

The 5' end of the primers to be tested is labelled with 32P using the polynucleotide kinase 
reaction. Mix 5 µl sterile ddH2O, 2.8 µl 5x kinase buffer (250 mM Tris, 50 mM MgCl2, 50 mM 
DTT, 0.25 mg/ml BSA), 6.0 µl 10 µM primer, 0.8 µl T4 polynucleotide kinase, and 3.0 µl γ-32P 
ATP (6000 Ci/mmol). Incubate at 37° for 1 hour, then add 26 µl sterile ddH2O and spin through 
a Select D column (Five Prime Three Prime) loaded with P4 Biogel (BIORAD) according to the 
manufacturer's recommendations. The labelled primers may be stored at -20°C. 

For optimization, set up simultaneous PCR reactions as described in Example 2, using 
DNA templates described above (e.g., 2 CEPH (1331-1, 1347-02), 1 pooled sample (50 
chromosomes), 1 no DNA). Perform PCR at the annealing temperature (Ta) calculated as 
follows: 

Ta = 2(A+T) + 4(G+C) 

(if the calculated temperatures for 2 primers differ greatly, for example 54° and 64°, begin 
closer to the lower Ta). 

Check the amplified PCR product for artifact by electrophoresis on a 6% gel. Continue 
optimization of the selected 32P-labelled primer with control individual, [...] 
temperature in [...] increments until nonspecific products are eliminated. On average, 
determinations at approximately 4 Ta values are required to optimize each primer. 
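
The starting annealing temperature can be estimated from base counts alone. The sketch below assumes the widely used rule of thumb 2(A+T) + 4(G+C), where A, T, G and C are the numbers of each base in the primer. The function names and the 0.75 weighting toward the lower estimate are illustrative assumptions; the text only says to begin closer to the lower value.

```python
def wallace_temp_c(primer: str) -> int:
    """Estimate a temperature as 2*(A+T) + 4*(G+C) from base counts."""
    seq = "".join(primer.upper().split())  # drop spaces between triplets
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters: {sorted(unknown)}")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def starting_temp_c(primer_a: str, primer_b: str, weight_low: float = 0.75) -> float:
    """Pick a first trial temperature between the two estimates,
    weighted toward the lower one (the weight is an assumption)."""
    low, high = sorted((wallace_temp_c(primer_a), wallace_temp_c(primer_b)))
    return weight_low * low + (1 - weight_low) * high


# A 20-mer with 7 A/T and 13 G/C bases, written in triplets as in the figures.
print(wallace_temp_c("GTC AGC ACC CCA ACC AGC CT"))
```

The estimate is only a starting point; the empirical stepping described above still decides the temperature finally used for each marker.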

When all markers from a SET are optimized (usually 8 markers), 3 µl from a pool of 
PCR product of DNA from unrelated individuals using primers for each marker in the SET is 
combined with an equal volume of loading buffer (2X concentration). Seven µl (or maximum 
well volume) of the combined mixture is loaded on a gel and electrophoresed. This last check 
on size and product intensity assures that the markers are robust and are spaced about 10 
nucleotides apart. Primer sequences may then be used to synthesize fluorescent/biotinylated 
products. 

EXAMPLE 5 

A protocol extending this approach to include up to 24 microsatellite markers in each 
electrophoretic lane was tested as follows. The selection of markers was based on the need to 
maximize heterozygosity (genetic informativeness), to distribute markers across the entire genetic 
map, and to place each marker within a SET based on the known size of the PCR 
products (alleles and stuttering bands produced must not overlap with those of the marker above 
or below it). 

Highly informative microsatellite markers were assembled into a ladder or "SET". Each 
marker in a SET is spaced a distance of at least 9 nucleotides from surrounding markers such 
that none of the PCR products overlap in size when separated on a 6% denaturing acrylamide 
gel. Since many dinucleotide repeats produce a complex pattern of 3 or more stutter bands, this 
spacing is critical to assure that more intense stutter bands from an upper marker will not be 
misinterpreted as a product from a lower marker. In addition, new alleles both larger and 
smaller than the reported product sizes for this type of marker have occasionally been 
discovered. Each SET was labelled with one of three different commercially available 
fluorophores (TMR, FAM, and JOE; Applied Biosystems). The fourth fluorophore (ROX) was 
reserved for the internal size standard. Three SETS each labelled with a different fluorophore 
were pooled into a collection of markers we have termed a "GROUP". 
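
The spacing rule for a SET lends itself to an automatic check. The sketch below is a hypothetical helper, not part of the described protocol: it orders markers by size and reports neighbouring pairs whose allele ranges lie closer than 9 nucleotides. The example ranges are those listed for GROUP 2, SET B in FIG. 7B-1, as far as they can be read.

```python
MIN_GAP_NT = 9  # minimum spacing between neighbouring markers, per the text


def check_set_spacing(markers: dict[str, tuple[int, int]],
                      min_gap: int = MIN_GAP_NT) -> list[tuple[str, str, int]]:
    """Return (upper, lower, gap) for neighbouring markers closer than min_gap.

    Each marker maps to (smallest allele, largest allele) in nucleotides.
    """
    ordered = sorted(markers.items(), key=lambda item: item[1][0], reverse=True)
    problems = []
    for (upper, (upper_min, _)), (lower, (_, lower_max)) in zip(ordered, ordered[1:]):
        gap = upper_min - lower_max
        if gap < min_gap:
            problems.append((upper, lower, gap))
    return problems


set_b = {
    "CRP": (331, 349), "TCRD": (309, 319), "IL-9": (271, 283),
    "D11S876": (216, 242), "SRC": (193, 207), "D12S63": (161, 175),
    "D3S11": (135, 147), "D2S102": (102, 126),
}
print(check_set_spacing(set_b))
```

An empty list means every neighbouring pair meets the spacing requirement. A newly discovered allele outside the reported range would show up here as a reduced or negative gap.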

New primers were designed as necessary using OLIGO 4.0 (Research Genetics, 
Huntsville, AL) to fit within the marker ladder. Each GROUP was constructed to avoid overlap 
between markers within SETS but to allow overlap between SETS. 

The autoradiographic image produced by many markers varied depending on whether the 
forward or reverse primer was labelled (see Figure 8). Therefore, both primers from each 
marker were evaluated for image clarity and the ability to distinguish the most intense product(s) 
or alleles. The appropriate primer was then selected for further use. Optimization of the PCR 
conditions for each marker was also accomplished using radiolabeling. The strategy of 
[...] each marker. GROUPS 1 and 2 have 6 and 9 discrete annealing 
temperatures, respectively (see Figures 7A and B). An entire [...] containing DNA 
from a number of different individuals will usually be amplified for a given [...] 
temperature, and therefore this should not reduce the overall efficiency of [...]. For 
studies with fewer samples a thermalcycler block may be used with a lower capacity. 

Variability [...] 
annealing temperature when switching [...]. Therefore the use of 
protocols described for marker GROUPS 1 and 2 should be preceded by a reevaluation of the 
suggested annealing temperatures for optimal performance. This can generally be carried out 
once on a few markers and when necessary the annealing temperatures can be adjusted up or 
down for all the markers for that machine. 

The intensity of the products varied considerably from marker to marker. When markers 
were radiolabelled and a SET was run on the same gel, detecting all of the products on the gel 
with a single film exposure was often impossible. Attempts to score on a single gel the larger 
products in each SET using radioactive-based techniques [...]. Although gradient 
gels improved the band spacing, a maximum of 4-5 markers could be resolved per gel on 
autoradiographs. An autoradiograph of GROUP 2 SET B is shown in Figure 9. The range of 
intensity of products of this SET is typical of [...] 
autoradiographs are read for genotyping. These problems are partially overcome by the use 
of fluorescent labels (Ziegle et al., 1992). Fluorescent signal detection is linear over a greater 
range, so that the markers with the weakest product intensity are more readily typed in real-time 
along with the most intense products from other markers. 

Marker GROUPS 1 and 2 are described in Figures 7A and B, respectively. The primer 
sequences, chromosomal location, choice of labelled primer, and optimal annealing temperature 
are listed for each locus. GROUP 1 is composed of a combination of 21 di-, tri-, and 
tetranucleotide markers from multiple linkage groups. The product sizes range from 66 to 322 
nucleotides. GROUP 2 is composed of 24 dinucleotide markers with products ranging in size 
from 75 to 349 nucleotides. The mean heterozygosity for both GROUPS is 74%. 

Scoring of the fluorescent products using the ABI 373 sequencer and GeneScan 672 
software was unambiguous in samples that were desalted by ethanol precipitation. Desalting was 
carried out as follows: 5 µl of each PCR product from the same SET (like color) was combined. 
Then 1.0 µl per marker per SET was combined for each of the 3 SETS, giving a final volume 
(in µl) equal to the total number of markers in the GROUP. Sample handling was otherwise exactly 
as described above for the individual fluorescent markers. 
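
The pooling volumes follow directly from the number of markers in each SET. The short sketch below works through that arithmetic; the function name and the returned keys are hypothetical, and the per-SET marker counts (8, 8, 8 for a 24-marker GROUP and 7, 7, 7 for a 21-marker GROUP) are assumptions for illustration.

```python
def pooling_volumes_ul(markers_per_set: list[int],
                       per_product_ul: float = 5.0,
                       per_marker_ul: float = 1.0) -> dict:
    """Volumes for the two pooling steps described in the text.

    Step 1: combine per_product_ul of every PCR product within a SET.
    Step 2: take per_marker_ul per marker from each SET pool into the GROUP pool.
    """
    set_pools = [n * per_product_ul for n in markers_per_set]
    aliquots = [n * per_marker_ul for n in markers_per_set]
    return {
        "set_pool_ul": set_pools,
        "aliquot_ul": aliquots,
        "group_pool_ul": sum(aliquots),
    }


group2 = pooling_volumes_ul([8, 8, 8])  # assumed split of a 24-marker GROUP
print(group2)
```

For a 24-marker GROUP this gives a 24 µl pool, matching the statement that the final volume equals the number of markers in the GROUP.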

A typical set of electrophoretograms of each SET from GROUP 2 for a single individual 
is illustrated in Figure 5. Each of the alleles can be easily recognized by the unique signature 
of the stuttering bands for these dinucleotide repeat markers amplified by PCR. Samples that 
were not desalted were difficult to score because the mobilities of the products and the 
ROX-2500 internal lane standards were altered. Salt and primer loads become a problem when 
combining multiple products for electrophoresis because the necessary volume reduction results 
in sample concentration. The salt concentration rises with the product concentration and 
[...] 21 to 24 markers. 

It will be understood that while the invention has been described in conjunction with 
specific embodiments thereof, the foregoing description and examples are intended to illustrate, 
but not limit the scope of the invention. Other aspects, advantages and modifications will be 
apparent to those skilled in the art to which the invention pertains, and these aspects and 
modifications are within the scope of the invention, which is limited only by the appended 
claims. 

CLAIMS: 

1. A kit for use in automated genotyping within a population comprising at least 4 
GROUPS of at least three SETS each comprising labelled pairs of primers for amplification of 
DNA by polymerase chain reaction (PCR), 

each primer pair having unique sequence found in the flanking sequences of a 
microsatellite sequence comprising a nucleotide repeat sequence flanked by unique sequences, 
such that a polymerase chain reaction (PCR) primed with the primer pair amplifies the 
nucleotide repeat sequence and at least some immediately adjacent unique sequences of the 
microsatellite sequence to produce a PCR product identified with the primer pair, wherein the 
microsatellite sequences are nucleotide repeat sequences that are polymorphic within the 
population, 

each SET consisting of at least 6 primer pairs, each primer having the sequence 
of unique sequences respectively flanking at least 6 microsatellite sequences in the genome, such 
that the length of the segment amplified by a particular primer pair differs from the length of 
all other segments in the SET by at least 5 nucleotides, and at least one primer of each primer 
pair is labelled with a fluorescent label that is the same fluorescent label for all primer pairs in 
the SET, 

each GROUP consisting of at least three SETS of primer pairs labelled with 
fluorescent labels, wherein the wavelength at which the respective fluorescent labels fluoresce 
is substantially different for the labelled primers in each of the respective SETS, 

wherein the distance in the genome between one microsatellite sequence amplified 
by a primer pair of the kit and the nearest other microsatellite sequence amplified by another 
primer pair of the kit is at least 2 centimorgans (cM) and no more than 50 cM. 

2. The kit of claim 1, wherein the PCR products identified with any primer pair 
amplifying microsatellite sequences containing dinucleotide repeats differ in length from PCR 
products identified with all other primer pairs of the same SET by at least 9 nucleotides. 

3. The kit of claim 1, wherein one of said GROUPS consists of the three SETs of 
Figure 7A. 

4. The kit of claim 1, wherein one of said GROUPS consists of the three SETs of 
Figure 7B. 

5. The kit of claim 1, containing the 6 SETs shown in Figures 7A and 7B. 

6. A method of analyzing genomic DNA for the presence of polymorphisms 
comprising 

a) extracting DNA from a human sample; 

b) combining, in a polymerase chain reaction (PCR) vessel, an aliquot of said 
DNA from a human sample, at least one primer pair selected from a GROUP in the kit of claim 
1, and PCR amplification enzymes; 

c) cycling the temperature of each PCR vessel so that PCR products identified 
with said at least one primer pair are produced by PCR amplification of segments from said 
DNA from a human sample, each vessel being cycled at an annealing temperature wherein 
non-specific annealing of the primers to said DNA from a human sample is minimized; 

d) then combining all PCR products from all PCR vessels containing primer 
pairs from one GROUP into a mixture, and subsequently separating the mixture of PCR products 
electrophoretically by size; 

e) detecting separated PCR products by fluorescence detection at wavelengths 
corresponding to the fluorescent wavelength for each of the fluorescent labels in the kit. 

7. The method of claim 6, wherein the step of combining amplified DNA further 
comprises: 

i) contacting each vessel with a plurality of paramagnetic beads carrying on 
the surface a protein which specifically binds biotin, further wherein one primer of each primer 
pair is labelled with a fluorescent label and the other with biotin, for a period sufficient for said 
protein to bind biotin; 

ii) separating the magnetic beads from the PCR reaction medium; 

iii) separating the two strands of the amplified DNA segments and combining 
the strands labelled with a fluorescent label for all primer pairs from one GROUP into a 
mixture. 

8. The method of claim 6, wherein the step of combining amplified DNA from the 
PCR vessels further comprises: 

i) contacting each vessel with a plurality of magnetic beads carrying DNA 
complementary to the sequence of one primer of the primer pair in the vessel for a period 
sufficient to allow annealing between the primer and the DNA on the magnetic beads; 

ii) separating the magnetic beads from the PCR reaction medium; and 

iii) eluting the PCR product from the magnetic beads. 

9. The method of claim 6, wherein each primer pair of said kit is added to a 
different PCR vessel in step (b), such that the annealing temperature for temperature cycling in 
step (c) is the temperature wherein non-specific annealing of the unique primer pair is minimized 
and PCR product from all PCR vessels containing at least one primer pair from the same 
GROUP are combined in a single mixture before electrophoretic separation. 

10. A method for selecting a SET of PCR primers for use in automated genotyping 

comprising 

ftoM each by a, 2 ««,too^ ^ ^ ^^^^^ ^ 

population; 

«« o, ^„ ^ ^ »h >ta. tte tags. 

«=«n.6o..h=ta,gft„,3Upo,y„™pk,„^„^^^„„^^^^^_^^^ 

11. A kit for use in automated genotyping comprising at least 4 GROUPS of at least 
3 SETS of PCR primers obtained by the method of claim 10. 

12. The kit of claim 11, wherein at least one primer of each primer pair in the SET 
is labelled with a fluorescent label that is the same fluorescent label for all primer pairs in the 
SET. 

13. The kit of claim 11, wherein the length of polymorphs of the DNA segment 
amplified by any primer pair amplifying microsatellite sequences containing dinucleotide repeats 
differs in length from the DNA segment amplified by all other primer pairs of the same SET by 
at least 9 nucleotides. 

14. A method of analyzing genomic DNA for the presence of polymorphisms 
comprising 

a) extracting DNA from a human sample; 

WO 95/15400    PCT/US94/13945 

b) combining, in a polymerase chain reaction (PCR) vessel, an aliquot of said 
DNA from a human sample, at least one primer pair selected from a GROUP in the kit of claim 
11, and PCR amplification enzymes; 

c) cycling the temperature of each PCR vessel so that PCR products 
consisting essentially of amplified DNA segments labelled with detectable labels are produced 
by PCR amplification and the PCR products for all primer pairs in the SET are detectably 
labelled with the same label, each vessel being cycled at an annealing temperature wherein 
non-specific annealing is minimized; 

d) separating electrophoretically by size a mixture containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET; 

e) detecting separated detectably labelled PCR products and characterizing 
them by length. 

15. The method of claim 14, wherein the mixture in step (d) containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET is 
obtained by: 

i) contacting each vessel with a plurality of paramagnetic beads carrying on 
the surface a protein which specifically binds biotin, further wherein one primer of each primer 
pair is labelled with a fluorescent label and the other with biotin, for a period sufficient for said 
protein to bind biotin; 

ii) separating the magnetic beads from the PCR reaction medium; 

iii) separating the two strands of the amplified DNA segments and combining 
the strands labelled with a fluorescent label for all primer pairs from one GROUP into a 
mixture. 

16. The method of claim 14, wherein the mixture in step (d) containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET is 
obtained by: 

i) contacting each vessel with a plurality of magnetic beads carrying DNA 
complementary to the sequence of one primer of the primer pair in the vessel for a period 
sufficient to allow annealing between the primer and the DNA on the magnetic beads; 

ii) separating the magnetic beads from the PCR reaction medium; and 

iii) eluting the PCR product from the magnetic beads. 

17. A kit for analysis by polymerase chain reaction (PCR) of a genomic region 
containing at least 6 known loci at which genetic rearrangement is diagnostic for a disease, 
comprising at least one SET containing at least 6 PCR primer pairs, 

each primer pair having the sequence of unique sequences flanking one of said 
at least 6 loci of genomic rearrangement, such that a polymerase chain reaction (PCR) primed 
with the primer pair amplifies the DNA segment surrounding the locus of rearrangement to 
produce a PCR product of characteristic length, wherein the length of the PCR product is 
associated with specific diagnostic information, and wherein the length of the PCR product 
amplified by a particular pair of primers differs from the length of all other PCR products 
amplified by other primers in the SET and the PCR products for all primer pairs in the SET are 
detectably labelled with the same label. 

18. A diagnostic method for detection by polymerase chain reaction (PCR) of genomic 
rearrangement in a genomic region containing at least 6 known loci at which genetic 
rearrangement is diagnostic for a disease, comprising 

(a) extracting DNA from a human sample; 

(b) combining, in a polymerase chain reaction (PCR) vessel, an aliquot of said 
DNA from a human sample, at least one pair of amplification primers selected from a SET of 
at least 6 primer pairs, and PCR amplification enzymes, each primer pair of said SET having 
the sequence of unique sequences flanking one of said at least 6 loci of genomic rearrangement, 
such that a polymerase chain reaction (PCR) primed with the primer pair amplifies the DNA 
segment surrounding the locus of rearrangement to produce a PCR product of characteristic 
length, wherein change in the length of the PCR product is associated with rearrangement at the 
locus of rearrangement, and wherein the length of PCR products amplified by a particular pair 
of primers differs from the length of all other PCR products amplified by other primers in the 
SET; 

c) cycling the temperature of each PCR vessel so that PCR products 
consisting essentially of amplified DNA segments labelled with detectable labels are produced 
by PCR amplification and the PCR products for all primer pairs in the SET are detectably 
labelled with the same label, each vessel being cycled at an annealing temperature wherein 
non-specific annealing is minimized; 

d) separating electrophoretically by size a mixture containing all PCR 
products amplified from said DNA from a human sample by any primer pair of said SET; 

e) detecting separated detectably labelled PCR products and characterizing 
them by length. 

19. The method of claim 14, wherein each primer pair of said SET is added to a 
different PCR vessel in step (b), such that the annealing temperature for temperature cycling in 
step (c) is the temperature wherein non-specific annealing of the unique primer pair is minimized 
and PCR product from all PCR vessels containing at least one primer pair from said SET are 
combined in a single mixture before electrophoretic separation. 

FIG. 7A-1 


Marker      Alleles (bp)    Heterozygosity    Chromosome

SET A
TH          (308-322)       75%               11
HGH         (229-297)       83%               17
D10S88      (205-217)       54%               10
D4S174      (175-195)       98%               4
APOC2       (129-165)       80%               19
D18S34      (103-119)       79%               18
D8S85       (74-84)         79%               8

SET B
D21S11      (260-352)       82%               21
CYP2D       (220-240)       70%               22
D5S211      (186-204)       72%               5
D2S72       (159-173)       71%               2
D4S175      (112-134)       82%               4
D16S261     (88-100)        67%               16
D13S71      (67-79)         76%               13

SET C
CYP19       a75-304)        91%               15
FABP2       (230-250)       64%               4
IGF1        (176-196)       54%               12
IL2RB       (149-163)       91%               22
D7S435      (122-134)       59%               7
D9S43       (80-102)        83%               9
D19S76      (66-70)         52%               19
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FIG. 7A-2 

GROUP 1 



A Primer 

5'-GTC AGC ACC CCA ACC AGC CT-S' 
5'-TCC AGC CTC GGA GAC AGA AT-3' 
5'-GTr AGC ATA ATG CCC TCA AG-3 ' 
5'.AAG AAC CAT GCG ATA CGA CT-3' 
5'-CAT AGC GAG ACT CCA TCT CC-3' 
5'-CAG AAA ATT CTC TCT GGC TA-3* 
5'-AGC TAT CAT CAC CCT ATA AAA T-3' 

5'.CTG TTA TGG GAC TTT TCT CA-3' 
5".AT0 ACT TCC CCA CTT TTT ACS' 
5'-ACT TTG AAA ACC ACT GGC CT-3' 
5'-AGC TAT AAT TGC ATC ATT GCA-S' 
5'-ATC TCT GTT CCC TCC CTG TT-S' 
5'-AAG CTT GTA TCT TTC TCA GG-3' 
5'-GTA TTT TTG GTA TGC TTG TGC-3 ' 



5'-AAT CTT CTT TTT TGT CTA TGA-3 • 
5'-GTG CCA TTT TAC AGT CTC CT-3' 
5*-GCT AGC CAG CTG GTG TTA TT-3' 
5'-GAG AGG GAG GGC CTG CGT TC-S* 
5'-TTA AAA TGT TGA AGG CAT CTT C-3 
5'-1TC TGA TAT CAA AAC CTG GC-S* 
5'-AAA AGT GTG TTA CTT TCA GAA C-3' 



B Primer 

5'-ACC GAA GAC CCC TCC TGT GG-3' 
5'.AGT CCT TTC TCC AGA GCA GGT.3' 
5'-CGA TGG AGT TTA TGT TGA GA-3 ' 
5'-CAT TCC TAG ATG GGT AAA GC.3 ' 
5'-GGG AGA GGG CAA AGA TCT AT-3' 
5'-CTC ATG TTC CTG GCA AGA AT-3 ' 
5'.AGT TTA ACC ATG TCT CTC CCG-3' 

5'.AAT GTA TGA AGT GGT ATG AT-3 ' 
5'-GCTGAG ATG GGA GGA TTG CT-3' 
5'-ATG TAT CTA GCC ATG GTA GC-3 ' 
5'-TGG TCT ATA ACT GGT CTA TG-3' 
5'-Crr ATT GGC CTT GAA GGT AG-3 ' 
5'-ATC TAC CTT GGC TGT CAT TG.3 ' 
5*-CTA nr TGG AAT ATA TGT GCC T-3' 



S'-CGTTTG ACT CCG TGT GTT TGA-3' 
S'-TTT CCA TTG TCT GTC CGT Tr-3 ' 
5'-ACC ACT CTG GGA GAA GGG TA-3' 
5'-CAC CCA GGG CCA GAT AAA GA-3 ' 
5'-TrrGAG TAG GTG GCA TCT CA-3' 
5'-AAG GAT ATT GTC CTG AGG A-3' 
5'-ACA AGG TGA CAA GGT GCC TA-3' 
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FIG. 7A'3 



Annealing      Labeled
Temperature    Primer

62°C           A
54°C           B
58°C           A
66°C           A
58°C           A
54°C           A

54°C           B
58°C           B
58°C           B
62°C           B
55°C           A

58°C           B
60°C           A
62°C           B
66°C           A
58°C           B
58°C           A
58°C           B

FIG. 7B-1 

Marker      Alleles (bp)    Heterozygosity    Chromosome

SET A
ATP5B       (337-343)       60%               12
GABRB       (310-318)       72%               4
TYR         (286-298)       58%               11
CFTR        (258-276)       82%               7
D11S534     (228-244)       74%               11
D11S420     (188-208)       66%               11
Leu-2/T8    (138-170)       71%
D9S53       (93-127)        87%               9

SET B
CRP         (331-349)       56%               1
TCRD        (309-319)       74%               14
IL-9        (271-283)       63%               5
D11S876     (216-242)       89%               11
SRC         (193-207)       71%               20
D12S63      (161-175)       72%               12
D3S11       (135-147)       93%               3
D2S102      (102-126)       86%               2

SET C
D4S230      (276-302)       83%               4
D21S212     (240-260)       86%               21
D6S89       (199-227)       88%               6
FTHP1       (171-181)       91%               6
D3S196      (149-161)       68%               3
D20S27      (128-138)       65%               20
D7S472      (104-116)       70%               7
D12S58      (75-91)         61%               12
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FIG, 7B-2 

GROUP 2 



A Primer 



B Primer 



5'-AAA CCC AAA CCC AGA GGA TT-S* 
5*.GGC ATG TCA TIT TCG TAA GC-3' 
5'-AAT ATG GOT ACA GCA TTG GA-S* 
5'-GAG CGA CAG CAA AAT CAG CC-3* 
i'-ATA TGG AAA CTC TCC GTA CT3' 
5'-AGT TAG ACC GGT TCT GCA GA-3* 
5'-ACT GCC TCA TCC AGT TTC AG-S* 
5'-TCC TGG err TAA ACT TCA CAC AC-3' 



5'-AGG TGG GTG GAT AAC TTG AG-3' 
5'-GTG GGC CAC ATT AGG AAC AG-3' 
5'-TGG GCG ATT TGT TCA TTG TG-3' 
5'-TGG AAG GAC GGG AAA TAA TA-3* 
S'-GCA ACC ATG GAG AGT CTG GA-3' 
5'-GAT TAA TGA TAG TGC TAT ZC-V 
S'-GAG CAG GCA CTT GTT AGA TG-y 
y-GGA ATA TGT TTT TAT TAG CTT GT-3' 



5*-GAA CAG AAC AGT GGA GCA TC-3' 
5'-TAG GAG GCA GAG GAT GGT TC-3' 
5*-CCC CAC TCT TAG CCA TTG TA-3' 
5'.TGG AGA TGT GCC ATA GAG GT-3' 
5*.TrC AAG TGG TTG CCT CTG GC-3' 
5'-ATG CTT TAT CCA GAG AAA AG-3' 
5*-CAA ACT TTC CAC AGT ATC GTT C-3' 
5'-CCA AAT GCT GGA GAC AGA GAG AA-3' 

y-rrc tca caa agt cac cac at-3* 

S'-GGC CTC CTG GAA TAA TTC TC-3* 
S'-CTT GTT CAT CTG CCT TGT GC-3 ' 
5'-ATC AAT GGA AAA ATG GGT AA-3' 
5'-ACT GGG GAA CAT GGT GGG GT-3' 
5*-TIT ATG CGA GCG TAT GGA TA-3' 
5'-TCC TCA AAA TGA AGA ACA CA-3' 
S'-CCT GGA AAA ATG GCT CAC C-3' 



S'^GGC ATA CGA GAA AAT ACT GT-3' 
5'-CAC CAG CCC CAT TCC TTA GC-3 ' 
5'-GAG ACA CAG AGC AAA TAG GT-3' 
S'-TCA GGA AAA CTG CCT GAG 0-3' 
5*-AGC AAC TTG CCC AGG CTA TGA-3' 
5'-CAT CAT TAA TTG GAT TGT GG-3' 
5'-GTT TCC TTG AGA AGA ATG GAG C-3' 
S'.ACC CCT CCC TCC CTC CAT CAC AC-3' 

5*-TAG GGA AAA TGA CAG GAA AA.3' 
S'-CAT TIT AAT GAA CAC CGC TC-3' 
5'-ACC TAA GCG ACT GCC TAA AC-3' 
5'-TAT CTT TCT CTG TCT GCC Tr-3' 
5'-ATG ATG ATT GCC AAA GGG AA-3' 
5'-CAC CAC CAT TGA TCT GGA AG-3' 
5'-AAA AGT CTA GTG TTG AGT GT-3' 
5'-GGA AAA TCA GTC TCT AGT TG-S' 
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FIG, 7B'3 



Annealing      Labeled
Temperature    Primer

65°C           A
68°C           A
63°C           B

FIG. 7C-1 



Marker Alleles (bp) Heterozygosity Chromosome 

SETA 
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D1S244 
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D1S243 
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87% 




D1SI97 


(115-129) 


80% 




D1S226 


(90-106) 
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SET B 








D7S527 
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75% 
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D7S53I 


(241-255) 


77% 
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D7S529 


(218-226) 


68% 
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D1S215 


(189-207) 


73% 
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83% 
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(133-147) 


77% 




D2S157 
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75% 




D1S255 


(74-88) 


76% 
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95% 
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(267-279) 


75% 




D1S220 


(231-251) 
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(203-211) 
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D4S424 


(178-192) 


84% 
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77% 
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(111-113) 


79% 
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FIG. rc-2 

GROUP 3 



A Primer 



B Primer 



5'-GAC TTC ACC ATC AAC GCC IQ^V
5'-GAG CAG CAC CGT ACA AAT-3'
5'-TAA CAT GAG CGA ATG GAC AA-3'
5'-GCC CAG GAG GTT GAG G-3'
5'-GGT ATG GAA GTC ACC CAA CAO'
5'-CAC ACA GGC TCA CAT GCC-3'
5'-TCA TGT CCC TCC TCC CAA AG-3'
5'-GCT AGT CAG GCA TGA GCG-3'

5'-CAT TGC AAA CTC AGG AGA TA-3'
5'-AAA CTG TGG TCC TGG CTG-3'
5'-AAA TTC TAG ACA TCG CCT GTA A-3'
5'-GAC ACA GGT AGG TTA GAA GGA TG-3'
5'-CCA GNC TCG GTA TGT TTF TAC TA-3'
5'-AAA AAC GTA CTG CCA CAT TC-3'
5'-AGC CAG CAT TAC CTC TGN TAC C-3'
5'-TrA GCA AAT CCC AAG CAA TA-3'

5'-GGT GCC AGA CTA TGC AGA CCS'
5'-GGC TGT GGG TGT TTC TCC TA-3'
5'-GAT CGC CTA TGA CCT CCT TG-3'
5'-TTA ATA AAA ATA CCC CCA CC-3'
5'-GCG CTC TTG GTA TAT GGT ACA G-3'
5'-GAA TGT GAA AGG CTG TGC-3'
5'-TGG CCT GAA TAG ACC ATA AAA A-3'
5'-CAA CAC CCA AAC AGA TGA CC-3'

5'-CAG GAA AGT GGA TGT GAC GA-3'
5'-AGC TCC GCT CCC TGT AAT-3'
5'-CAA GGT TTC ACC ACA GTT CT-3'
5'-AAG GCA GGC TTG AAT TAC AG-3'
5'-CTC AAA ATG ACT GAT GGG GT-3'
5'-GCT CCA GCG TCA TGG ACT-3'
5'-GAG CAA GCA TCC AAA AAC GA-3'
5'-GGT CAC TTG ACA TTC GTG G-3'

5'-TAA CAG AGG CAT GAA AAC CA-3'
5'-AAA CTA GAG TCC TGG CCT GA-3'
5'-GGT ACC ATC ACC ACA ATC AA-3'
5'-TGT err GGT GAA TTG ACC CT-3'
5'-CTG AAA CCT CTG TCC AAG CC-3'
5'-ACT TGT AGG CCT GTT CTG AG-3'
5'-GAT CAC AGA TAT TGG CCC ATA G-3'
5'-GTG ATG GTG GTA AAG GCA GA-3'

5'-TAT GCT GAT TTA GGG AGC CC-3'
5'-AGC TCT CAT GNC TTT ACA TTC T-3'
5'-GCT GTC TGT GAG AGT TCG CA-3'
5'-GGA AAT AGG TGT GAA CAA AA-3'
5'-TGT GGG CAA CGT CAC TC-3'
5'-AAA ATT ACA AAG AAG ACC-3'
5'-GCC TGG GTG ACA AAG CA-3'
5'-AGT CTT TCA TGG CCA CTG TG-3'
FIG. 7C-3

Annealing Labeled 
Temp. Primer 



64° R
58° R
61° R
62° F
62° R
66° F
60° F
68° R

64° R
68° R
64° F
68° R
70° R
62° F
68° F
64° F

68° R
66° F
70° F
58° F
64° R
52° F
64° F
67° F
FIG. 7D-2

GROUP 4 

A Primer 



5'-GAG GCA GGA GAA TCA CYl-V
5'-AGA TGA GGG GTA ATG TTG GA-3'
5'-TTC GCT err TGA TAG GC-3'
5'-CCC err GGA AAA TCA CTG-3'
5'-CCT AAG TAG GCA GTT GGT AT-3'
5'-AAC TTA CAC ATT TGG CCC TG-3'
5'-AAC TGC AAC ATT GAA ATG GC-3'
5'-TGG AAA CTA TGT ATC TTG GAG G-3'

5'-CAT ATG CAT ACC ACA CAC-3'
5'-AGC TCA GAG ACA CCT CTC CA-3'
5'-TCA GCC TGA GTT TTC TTT AT-3'
5'-GGT CTG ATG AAA ATG TIC TCA AGC-3'
5'-AAC GTC TGC TCG TCA GAG TC-3'
5'-GCC TTG GGG GTA AAT ACT CT-3'
5'-TTT TCT TTT TTG CAG TTT ATC C-3'
5'-ATC TTC CAA AAA TGT CAT-3'

5'-GGC CAG GCT TTG TTC AGA-3'
5'-TTT AGC CTG AAA ATA CAC GC-3'
5'-TGC ACA TTA AAG GAA CAG GT-3'
5'-GAT CTG ATT AGT ATI GTC TGC TTG A-3'
5'-AAA TGT GAG TAG AAG GGA TAG GTT-3'
5'-GAG TGG CGG TGA GAA GGT AT-3'
5'-TGG AAT TTC TCC ATG TTG AG-3'
5'-GAA AAG AAT GCT GGA TAG-3'
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FIG. 7D-3 



B Primer 



5'-ATG GTT GTA GAT GAG ACT GG-3'
5'-AAG CAT err AAT GGA TGG AAA-3'
5'-ATT TCA TTT GTA ATT TAG TAG CAG-3'
5'-CCA TGA ATA AGC CTT GCC-3'
5'-CAC AGC AGG GGT TCA TIT Tr-3'
5'-TCA ATC TGT GGA GTC ATT GG-3'
5'-GGG ACC ATA GTT CTT GGT GA-3'
5'-GCN GGC Trr AGG GTG G-3'

5'-AAT CTT ATT GCT GTC TCA-3'
5'-CTG TAT TAG GAT ACT TGG CTA TTG A-3'
5'-CAA GGA GCA GGA AGA ACA GC-3'
5'-TAG ACT GGG TTG TTA GGG ACT CTC-3'
5'-CGA CTA CGT GCT GGC TAC TT-3'
5'-GGA ATT ACA GGC CAC TGC TC-3'
5'-CAC TTC AGT GCC TTC TTG AGA-3'
5'-CAT AAT AGG AGA ATA AGA-3'

5'-CAG GGT CTA TGA TAC GCT TT-3'
5'-GCT TTG CTC CTA GAG TCC AG-3'
5'-CAT AAT TTG CTG CTT TGG AT-3'
5'-GCT TTA TAG GAG GTA TCT TTN TGT G-3'
5'-TAA AAA AGN CCG ACT AGA CC-3'
5'-AGC CAT TGC TAT CTT TGA GG-3'
5'-AAG AGC TAT GAA AAG AGT TAA AGG A-3'
5'-CCA GTT TTT ATG GAC GGG GT-3'
FIG. 7D-4

Annealing Labeled
Temp. Primer

64° R
59° F
57° R
59° F
64° R
6r F
80° F
66° F
69° R
64° R
68° R
60° R
50° R
66° R
62° F
58° R
62° F
64° R
66° F
62° F

WO 95/15400

PCT/US94/13945

FIG. 7E-1

Marker     Alleles (bp)   Heterozygosity   Chromosome

SET A
D5S436     (334-354)      85%              5
D4S405     (279-299)      87%              4
D4S431     (246-270)      83%              4
D3S1303    (196-220)      78%              3
D3S1296    (176-186)      74%              3
D3S1271    (146-158)      75%              3
D4S407     (111-135)      87%              4
D4S404     (89-101)       79%              4

SET B
D3S1279    (264-282)      86%              3
D2S176     (240-250)      70%              2
D2S151     (211-229)      83%              2
D3S1305    (189-198)      74%              3
D4S403     (155-169)      76%              4
D4S411     (135-143)      66%              4
D2S173     (117-125)      70%              2
D4S392     (93-107)       84%              4

SET C
D5S418     (297-313)      80%              5
D6S313     (279-285)      68%              6
D6S314     (243-259)      81%              6
D6S289     (215-227)      80%              6
D7S513     (173-201)      84%              7
D7S492     (145-155)      78%              7
D7S478     (118-130)      70%              7
D6S294     (86-108)       83%              6



FIG. 7E-2

GROUP 5

A Primer

5'-AGG TCA TTG AGG TTT ATA TTC CCA-3'
5'-ATC AGG AGA TGT TGC CTT GC-3'
5'-AGG CAT ACT AGG CCG TAT T-3'
5'-CAG ACA ATG GCT TCC AAA AGT A-3'
5'-CCT GAA GGG TGT AAT TFT CA-3'
5'-TGA TTG GAG GTG GTA GAG GT-3'
5'-ATA ATA TCC TTT GAT CCT TTC GCT A-3'
5'-TTC CTC ATT TAG CTG CAC TAA G-3'

5'-CAC CAT CTG TGT GGT ATT GG-3'
5'-TZQ TGC ACT CGT TAT GAG AA-3'
5'-AAC TAA GAC ACA CAA CCC CQ^V
5'-CTG CTG GAA CTT AAA AGT GC-3'
5'-CAA CAG ATC TCC CAA GGT AG-3'
5'-AGG CTG TCT TGG CAG AAA T-3'
5'-GAG GGC TGT TGA CCC AC-3'
5'-TCG GTA AAC ATT CAT CCA GAO '

5'-AAA CAA AAT AGC CTT CAA AA-3'
5'-TAG GCC CAA GGA ATT NAA AA-3'
5'-AAA ATG ACT TCT TTG GGT GGG C-3'
5'-TIC GCT GAG ATC ATG CCA C-3'
5'-AGT GTT TTG AAG GTT GTA GGT TAA T-3'
5'-ATC TTG GAT TTA GGG TTG GC-3'
5'-TGT GTC ATT ACG CTT TTC ATC-3'
5'-TGC ATT GTT GTC ATG CCT-3'
FIG. 7E-3



B Primer 



5'-GAA CCC TAG GAA GTG AAA TAG AAA A-3'
5'-CAG GGC TAT GAT TGG ATG TC-3'
5'-TTC CCA TCA GCG TCT TC-3'
5'-CAA ACT TAG GGT TGT TCC TCA C-3'
5'-TGA GAA GGT GTG TTA GGG TG-3'
5'-AGC TAT CAT GTA GAA AAG CAG CA-3'
5'-AAA TIT GGT TAT TTT TAA GCA AAC T-3'
5'-TTG CTA AAC CTT GGG TGT GT-3'

5'-GAC CTA TIT TGG TTA ACA ATT TAG A-3'
5'-CTG ATG GAG GTT AAG GCA AG-3'
5'-CCA ATT CAG TGG CAT CTA TG-3'
5'-AGA AAT GAG ATA TTG TTT TCG C-3'
5'-CTC ATA ACT CAA AAC CTC TG-3'
5'-GAT GTA ATC CTG TGC TAT GGC-3'
5'-TTG CCT GGA AAC CTG GTA-3'
5'-TGT CAA AAT GGA CCA ATC AG-3'

5'-GCC TGG TAA GTT GAT AGT GT-3'
5'-TCA TCA TCA CCA CAA ATG CT-3'
5'-GTG GGT AGC AAC ACT GTG GC-3'
5'-AGA CCT TTA GGT TGT TCA TGC TG-3'
5'-ATA TCT TTC AGG GGA GCA GG-3'
5'-GGC TCT GCT CCA TCT TCA TA-3'
5'-TCA AAT GGT TCA GGA GAA AGA-3'
5'-TAA AGT CTC CAT CTT CGA TTG T-3'
FIG. 7E-4

Annealing Labeled 
Temp. Primer 



68° F
66° R
62° F
68° F
61° F
60° F
64° F
60° R
66° R
66° F
60° R
69° F
66° R
58° F
60° F
69° F
68° F
68° F
62° F
62° R
66° F
FIG. 7F-1



Marker     Alleles (bp)   Heterozygosity   Chromosome

SET A
D6S286     (315-341)      78%              6
D7S521     (288-306)      71%              7
D7S505     (262-278)      70%              7
D6S36l     (221-251)      77%              6
D7S518     (179-201)      88%              7
D6S292     (141-161)      83%              6
D6S264     (108-122)      71%              6
D6S268     (79-93)        75%              6

SET B
D5S412     (287-303)      83%              5
D5S413     (264-276)      70%              5
D5S428     (241-255)      77%              5
D5S419     (204-226)      82%              5
D5S423     (179-191)      77%              5
D5S421     (152-170)      83%              5
D6S273     (130-140)      77%              6
D5S392     (83-117)       92%              5

SET C
D7S517     (341-335)      83%              7
D8S265     (284-307)      75%              8
D8S282     (260-272)      73%              8
D8S272     (192-239)      82%              8
D7S530     (170-182)      78%              7
D8S275     (139-157)      76%              8
D8S255     (107-129)      74%              8
D7S520     (79-97)        70%
FIG. 7F-2

GROUP 6 

A Primer 



5'-TCA CCC CTA ATA CCC AAA AC-3'
5'-AGT GGA CAG TTG GTA TCT CA-3'
5'-ACT GGC CTG GCA GAG TCT-3'
5'-CAC AAT CAT ATG TNC CAA TT-3'
5'-CAG TAG GCA GGG GTG G-3'
5'-AAT TCA CAA GAC ACA ATC TCA G-3'
5'-AGC TGA CTT TAT GCT GTT CCT-3'
5'-CAA CAT ACT GCC TCA AAA-3'

5'-TTC GGC CAA AAA CAG AGT CC-3'
5'-AGT CAC CTT CTC TGT CTC CA-3'
5'-AAC ATC TTA GGG CAT CCT G-3'
5'-ATC TIT TAT TGT GGG GTG CT-3'
5'-CTG GGC AAC AAG AGT GAA AT-3'
5'-TGG AAA TAG AAT CCA GGC TT-3'
5'-GCA ACT TTT CTG TCA ATC CA-3'
5'-GCT ATT CCC ACA AAG GCA-3'

5'-ATC ATG GGA AGT GCG TGG-3'
5'-CTr TCC TGC CAA CCT CTT TC-3'
5'-GGG CAC AGG CAT GTG T-3'
5'-GAG AAC TAA TCC CTT CTG GC-3'
5'-TCC CTA CGT TGC ATT rrA-3'
5'-AAA TCG CTA GAA AAT GTC CA-3'
5'-TTT TGG AAT TTC TAG CCT CC-3'
5'-CAA CAG GTC CAG GCT ATG TC-3'
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FIG. 7F-3 



B Primer

5'-AAT ATG AAG GGA TGT TGA AT-3'
5'-TGT GAT CAG CCC AGG AAG AG-3'
5'-CAG CCA TTC GAG AGG TGT-3'
5'-Arr AAA TGT GCA TAC GCA AA-3'
5'-QQQ TGT GTC TGT GTG ACA AC-3'
5'-AGA ACT AAA GTT GCC TGT TCN TGT A-3'
5'-rrr TCC ATG CCC TTC TAT CA-3'
5'-TAC ACA AAA AGG AGG TCA TT-3'

5'-TGA GAA CTT CCA CAT AGC AG-3'
5'-AGG CCT CAT TCA AAA TCT GT-3'
5'-AAT GAT TTA AAA TAG ATT AGG AGC A-3'
5'-TGC CCA GAC TTC TCA CCT-3'
5'-CAA ATT CCA CAA AGC CGT-3'
5'-TCT ATC GTT AAC TTT ATT GAT TCA GO '
5'-ACC AAA CTT CAA ATT TTC GG-3'
5'-QQC GGA TCA TTG AGT GC-3'

5'-TAA TTA GTT GCT GGT TTG AA-3'
5'-TTG GGT TCA AGC GAT TCT CC-3'
5'-QQC TGC ATT CTG AAA GGT TA-3'
5'-AGC TTC ATA AAG AGT CTG GAA AAT-3'
5'-TAC CCA GCC AAA CTA TTA-3'
5'-TCA CAC CTG GGA ATT AGA AG-3'
5'-TGA AAC CCA CAG ATA TTG GG-3'
5'-TAT CCA TAC ACA CCA TGC CA-3'
FIG. 7F-4

Annealing Labeled 
Temp. Primer 

60° R 



60°
62°
62°
62°

F
R
R
R

64°



60° 
62° 
66° 
60° 
68° 



R 
F 
R 
R 
R 



56° R 

F 
F 
F 

56° F 
F 
F 
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Marker 

SET A

D10S211
D9S163
D12S102
D10S223
D6S271
D9S153
D9S170
D14S79

SET B

D4S402
D14S72
D13S221
D15S165
D14S68
D14S64
D13S175
D15S132

SET C

D18S64
D18S68
D18S66
D15S118
D16S420
D18S59
D16S423
D18S57
FIG. 7G-1

Alleles (bp) 



(289-305) 

(271-279) 

(241-259) 

(221-231) 

(166-208) 

(143-155) 

(108-126) 

(79-89) 



(301-321) 
(270-290) 
(244-262) 
(218-230) 
(179-201) 
(148-164) 
(121-139) 
(88-110) 



71% 

78% 

67% 

85% 

77% 

75% 

67% 



73% 

80% 

86% 

76% 

82% 

82% 

75% 

88% 



10 

9 

12 

10 

6 

9 

9 

14 



(287-323) 


92% 


4 


(257-271) 


83% 


14 


(223-243) 


83% 


13 


(184-208) 


80% 


15 


(148-172) 


89% 


14 


(126-136) 


77% 


14 


(69-83) 


76% 


13 


(74-88) 


76% 


15 



18 

18 

18 

15 

16 

18 

16 

18 
FIG. 7G-2

GROUP 7 

A Primer 

5'-GCT AGG ATT ACA GGC ACA T-3'
5'-TGC TGC ACA TCT TAG GGA GT-3'
5'-CTr TGC AGA ACC CAT GAT TAT GA-3'
5'-AAT TCT GAA GAG GGA AAT CTA A-3'
5'-AAC AAT TGG GAA ATG GCT TA-3'
5'-TTA TGG CAG CCC AAA TGG ACT A-3'
5'-CAG GCA CAC GCA TAC AC-3'
5'-AGG TTG ATA GAC CAT GGA GAC A-3'

5'-CTr ACT GTG TTG CCC AAG GT-3'
5'-TGT AAA GTT TTG TAC ATG GTG TAA T-3'
5'-TAG CCA TGA TAG GAA ATC AAC C-3'
5'-GTT TAC GCC TCA TGG ATT TA-3'
5'-GAG AGG TGG TTT TCA GTG GT-3'
5'-GGG CAA CAC AGT GAG ACT CT-3'
5'-TAT TGG ATA CTT GAA TCT GCT G-3'
5'-CTG ATA ATA AAA CCA GGA AGA CAC-3'

5'-TTC TGG AAA TGG ATA CTG GT-3'
5'-ATG GGA GAC GTA ATA CAC CC-3'
5'-AGA GCA AGT CCC TGC C-3'
5'-TCA AAG ACC CAT ATC AAC CA-3'
5'-ATT TCC TGA GGT CTA AAG CAC CC-3'
5'-AGC TTC TAT CCA ACA GGG GC-3'
5'-AAC AGG CTT GAA AGT CTC TGT C-3'
5'-TTC AGG GTC TTT TGA AGA GG-3'
FIG. 7G-3



B Primer 



5'-AGG CTC CTA CTA CCG TCA C-3'
5'-ACA GCG CTC AGA AAT CAT ATA A-3'
5'-ATT GCC TTG GAG GGC G-3'
5'-AGG AAA ATA TAG ACA ACC CAA G-3'
5'-TAG GTT GTG GTG GGT GTT AC-3'
5'-GCA GAA TGT TGC CCA AAA CTC A-3'
5'-ACT TCA GGA ATA GCC TFT ACC-3'
5'-TIT TAT TGT TAT GTG GCT TTC A-3'

5'-AGC TCT ATG ATT CAT TTC AAG TIT G-3'
5'-TCC TAA CAT TCT GCT ACC CA-3'
5'-GAG ATC GTG CAG CAC TTG T-3'
5'-GGG CAC ACA GTC CCA A-3'
5'-TCA GGG ATA GTT GGT GGG TA-3'
5'-TGG GAT AGA AGC AAC ACA GA-3'
5'-TGC ATC ACC TCA CAT AGG TTAO'
5'-TAT TGG CCT GAA GTG GTG-3'

5'-rrr GGA TGC ACA GGA AGT TG-3'
5'-ATG CTG CTG GTC TGA GG-3'
5'-CAG CCT CGG AGA AAC G-3'
5'-GTG CTG AAA AGC GAC ACT TA-3'
5'-TTA GGC CCA GTC CAC ACT CAA G-3'
5'-ACC AGA ATG TGA ACG ACC CT-3'
5'-GCC TAT TTG ATA ATG CTG TAC G-3'
5'-AGA AGG CAT TAA ATT TTG CA-3'
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FIG. 7G-4 

Annealing Labeled
Temp. Primer

66°

R
F

62°

F
R
F
R

(A" F
64° R
64° F
56° F
R

62° R
R

64° R

R
F
FIG. 7H-1



71% 
71% 



Marker Alleles (bp) Heterozygosity Chromosome 

SET A

D11S914 (275-285)
D11S910 (249-261)
D17S784 (226-238)
D22S274 (202-214)
D19S216 (179-191)
D21S259 (117-131)
D20S103 (92-106)



70 

78% 
76% 
80% 
71% 



11 
11 
17 

22 
19 
21 
20 



SET B

D12S89     (254-288)      79%              12
D10S205    (224-244)      90%              10
D12S101    (195-213)      82%              12
D12S91     (176-181)      70%              12
D11S902    (145-163)      81%              11
D10S249    (118-134)      75%              10
D11S903    (99-109)       75%              11



SET C 

D17S801 

D17S809 

D20S100 

D19S213

D18S58 

D18S52 

D17S793 



(258-336) 
(229-247) 
(194-218) 
(174-184) 
(144-160) 
(116-130) 
(95-109) 



72% 
77% 
69% 
74% 
77% 
70% 
FIG. 7H-2



GROUP 8

A Primer

5'-ATC TCA TGG GAG TAG CGT TG-3'
5'-AGC TTT GCA GAG AAG GCA AG-3'
5'-GAG TCT OCT AAA TGC TGG GG-3'
5'-GTC GAG GAG GTT GAT GC-3'
5'-TCT TGT CAC TCT AAC TCC GC-3'
5'-AGA ATG TGG TCT CAC AAG CC-3'
5'-QTl CAT AGA GGG ACA AGA CAC AGT-3'

5'-ATT TGA GAG CAG CGT GTT TT-3'
5'-GGC ACT TGT AAT CCC CG-3'
5'-CAA AAA AAT GTT TTA CTA AGC AGG-3'
5'-TTC ACA ACA GCC AAT GGT AG-3'
5'-CCC GGC TGT GAA TAT ACT TAA TGC-3'
5'-AAC TGG rrr TGG TAG TGA GA-3'
5'-AAC ACT TCG ATG TTC CTT CC-3'

5'-CCT CAA ACC GGA CAA CTA TTT-3'
5'-CAA AAA GGC AGA ATG CAG TA-3'
5'-ATT GGG TIT ACT TGT GCC Tr-3'
5'-CCT CCA ATC TGC ACC TGA CT-3'
5'-GCT CCC GGC TGG TTT T-3'
5'-TTN CAA CAT AGG TTA TAC GCG-3'
5'-TGT TGG AGT TAA TGT GCC AT-3'
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FIG. 7H-3 



B Primer     Annealing Temp.     Labeled Primer

5'-GAC CCA CAT CAC CAT TAG TG-3'
5'-TCC CTG CTC ATA ACT CAG CC-3'
5'-AGC TCC TGC ACA GTT CTT AAA TA-3'
5'-AGT GCC CAT TTC TCA AAA TA-3'
5'-GGC CCA TGT CTT TIT TAG GT-3'
5'-AGG GAA TGT CAA TGA AAA CC-3'
5'-CCA TGA TGT TTG GTT AAT CAC A-3'

5'-CCA TTA TGG GGA GTA GGG GT-3'
5'-TGA GCC ACT GCA CCT Q-V
5'-AGG CAT GAC TCA CCG C-3'
5'-rrC TCA AGG TTC GTC CAT GT-3'
5'-CCC AAC AGC AAT GGG AAG TT-3'
5'-GAG GTG CCC GCT AGT A-3'
5'-AGC TGA GAG CGC ATG TAT AA-3'

5'-CAG AGA GCA AGA TCC TAC CTC-3'
5'-TCC AGA GTC AAA AAC ACA GG-3'
5'-CGT GAT TTC ATT TCT TGC TG-3'
5'-TAG GCT TTG TTC TGG GGT TC-3'
5'-GCA GGA AAT CGC AGG AAC TT-3'
5'-GGC CCA GTT CAT TTT CTA GC-3'
5'-TCT TTG ACC CAG ACC TCT AA-3'
FIG. 7I-1




Marker

SET A

D11S928
D17S849
D19S217
D11S935
D12S90
D12S105
D11S912



(251-261) 

(219-233) 

(196-208) 

(166-182) 

(137-155) 

(101-123) 



SET B

D21S261 

D19S220 

D12S88 

D7S480 

D15S120 

D10S210 

D20S119 



(296-304) 

(265-283) 

(217-255) 

(189-206) 

(150-174) 

(130-140) 

(104-118) 



71% 
67% 
76% 
74% 

73% 

72% 

81% 



50% 



85% 

80% 

74% 

80% 

82%



SET C






D5S427 


(280-302) 


83% 


D4S412 


(237-249) 


76% 


D13S176 


(211-227) 


80% 


D10S212 


(189-201) 


71% 


D16S407 


(150-170) 


86% 


D11S969 


(141-149) 


76% 


D20S109 


(106-133) 


88% 



11

17 
19 

11

12 

12 

11 



21 

19 

12 

7 

15 

10 

20 



5 
4 
13 
10 
16 
11
20 
GROUP 9

A Primer

5'-MGTGATCCACC7X;ccn-G-3' 

^-GGGGITGATTCAAGTTGGTT.S. 

^••TACrAACCAAAAGAGlTGGGG.3. 
^•-AGCAGCAGCAGCCATATTGT-,. 
^'-TITACCTAAGGCTCGAirms. 
^•-^GTIGAGANTACTGCnTGG-S. 



^'■AMACACCTTACCTAAAACAGCA.3- 

^'-ATGlTCAGAAAGGCCATGrCAnTG.. 

^-^CACCACAGCATACCAGTA-a. 

5'-Crr GGG GAP Tr* 

"^ACTGAACCATCTT-S' 

^'-nTGTGATOGTCrrrT*^ 

^ TATaGG cat A-3• 
^'-CCTCAAroCACAACTCCT.3• 
-C-ACGACAG^TCAGTATC^.,,e-3. 



^•-GCCrrCACrAAGCAA^CTAAA-3. 
^'■ACTACCGCCAGGCACT-3' 

^'-CT^TGGGAntrcrrAG^X^ATAC.. 

^'-OAAGTAAAGCAAG^CTA.rCACG-3. 
^-CTCGCGCTGGGTACAGrrAT-3. 

^'-TTG ATT TGGAAG AIT to: AC-3- 
^•■AACACACATACAAACACACGCAGAT-3. 
FIG. 7I-3

B Primer     Annealing Temp.     Labeled Primer

5'-GeC TCT GAG AAT TAG TGT CTG TC-3'
5'-CTC TGG CTG AGG AGG C-3'
5'-CAA GAC CCA TAC CCA TGA-3'
5'-CTA TCA TTC AGA AAA TGT TCG C-3'
5'-AGT CAG GCC CAC CCA ATT TA-3'
5'-CAA AGT TGA CAC TGA TTA TAG CA-3'
5'-TFT TGT CTA GCC ATG ATT GC-3'

5'-AGA TGA TGG TGA GTC CTG AG-3'
5'-TCC CTA ACG GAT ACA CAG CAA CAC-3'
5'-AAT GAA CAG CAA AAA CTA AGG GA-3'
5'-AGC TAC CAT AGG GCT GGA GG-3'
5'-GGC TCA AAG TGT TTG CAC TG-3'
5'-CTC AGA CCT GGG TCA AGA TA-3'
CCA GAT TTA GGG GTG TAT G-V

5'-ACA TGC TCT GAA TCA CCT GA-3'
5'-CTA AGA TAT GAA AAC CTA AGG GA-3'
5'-ATA TTC AGA CAA AAG CCA AGT TA-3'
5'-TCT GTG TAC GTT GAA AAT CCC-3'
5'-AGA TCA GAG GAG TGG GTT CC-3'
5'-GGG GCA GAA TGG GTA T-3'
5'-TTC CAG ACA GGA CAG CCT GC-3'

FIG. 7J-1

Marker     Alleles (bp)   Heterozygosity   Chromosome

SET A
D5S416     (282-292)      78%              5
D8S271     (257-271)      78%              8
D7S523     (224-240)      80%              7
D8S260     (187-213)      83%              8
D7S550     (177-200)      83%              7
D7S507     (148-168)      90%              7
D7S526     (125-135)      72%              7
D7S484     (99-113)       74%              7

SET B
D20S106                                    20
D10S220    (267-291)      84%              10
D8S279     (229-257)      88%              8
D9S197     (199-215)      68%              9
D15S114    (177-187)      70%              15
D15S125    (157-169)      79%              15
D8S264     (121-145)      84%              8

SET C
D8S263     (275-289)      75%              8
D9S166     (233-261)      82%              9
D13S164    (208-219)      72%              13
D9S164     (187-199)      80%              9q
D17S800    (168-178)      74%              17
D2S207     (144-156)      71%              2p
D9S161     (119-135)      78%              9
FIG. 7J-3




5'-AGT GAA ACT CGG NCC CTA-3'
5'-AAC AAA CTT GCT TAT GAG TGT TAG T-3'
5'-AAA ACA TIT CCA TTA CCA CTG-3'
5'-GCT GAA GGC TGT TCT ATG GA-3'
5'-GCA GTT GGG TFA TTT CAA GTC-3'
5'-CTA CGT ACA TGG CTG CAA-3'
5'-CCA TCT TGG TGT GAG GGC-3'
5'-GCT GAG CAA GGC ATT GTT T-3'

5'-ACT GAG GTC ATG CAA GAG GC-3'
5'-GAG CAA GAC TGC ATC TCA AA-3'
5'-GTG TCA GGT CGG GGT G-3'
5'-ACG ATT TCT GGG AGA CTA TAT TGC-3'
5'-TTG TCA CTG CTr TTC TCT GC-3'
5'-CCC CTG AAG ACC GTG K.y
5'-CCA ACA CCT GAG TCA GCA TA-3'

5'-ATG TAA CAA AAT GGA GTC GG-3'
5'-TCC TAA TTC ACT GGG AAA AC-3'
5'-ATT ACA GGC GTG ACA CAC C-3'
5'-GTF TGC CTG GGG ATT GAT TT-3'
5'-ATA GAC TGT GTA CTG GGC ATF GA-3'
5'-ATG AAG AAA TAT ATA CAG TGC CG-3'
5'-CAT GCC TAG ACT CCT GAT CC-3'
FIG. 7K-1



Marker     Alleles (bp)   Heterozygosity   Chromosome

SET A








D5S408 


(247-299) 


73% 
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D9S180 


(220-265) 


63% 
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D5S4I4 


(186-206) 


82% 
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D1S304 


(168-206) 


60% 
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D6S344 


(139-159) 


72% 
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D12S76 


(112-124) 


71% 


12 


D10S219


(89-103) 


76% 


10 



SET B



D11S906 


(291-303) 


73% 


11 


D15S121 
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2 
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D14S74 


(291-313) 
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(259-275) 
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D9S168 


(227-247) 
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D16S42I 
FIG. 7K-2

GROUP 11

A Primer 



5'-ACA ACT TCC AAC CCT GAG AT-3'
5'-CAG TOG TIT GGA ATC GAA CC-3'
5'-QQC CAG rrC AGT CAA GTG-3'
5'-ACC err TTT CCT CCA ATC AT-3'
5'-CTC CAG CCT GGG TCA CTA-3'
5'-GGG CTA CAT GAT GAG ACC CT-3'
5'-TCT TTC TAC CAC CCC CC-3'

5'-AGC TGG GCA CCG ATA GTA GT-3'
5'-TTG TAT CAG GGA TTT GGT TA-3'
5'-CTC CAG CCT GCT GAC C-3'
5'-GCA GAT GGA AAA CAC CAC TT-3'
5'-ATG CTG GGA TCA CAG GC-3'
5'-TTA AAA ATT AAG TAG GCT TTT GGT T-3'
5'-Crr AAG GCA AAA TTC TIT TCA ACA C-3'

5'-CCT GTA CCA CTA CCT GAG TTG AGT-3'
5'-GAA err GCA TAA CCC GAA T-3'
5'-GGT TTG TGG TCT TTG TAA GG-3'
5'-ACA TGA ACC GAT TGG ACT GA-3'
5'-CCC TGT TCC AGT AAT GAT GAC C-3'
5'-TGC CAC TGT CTT GAA AAT CC-3'
5'-GAA TAA AAC AGG GTT TGG G-3'
FIG. 7K-3




5'-ACT GTG CCT AGC CTT CAT TT-3'
5'-AGC TAT TTT TGG GGG CTG AG-3'
5'-TGG rrC CAG CAT ATA GCG-3'
5'-AGA AGC TGA AAG CTG AGT GG-3'
5'-CTA ATG CAT GAC AAT AAT ATT TCC A-3'
5'-GCG GAG CTT CTT TTC TGT TG-3'
5'-GCA GAG AAC CTA AAG CAT CC-3'

5'-GCA CAG GCA AAG ANG AGG TA-3'
5'-TGT TGT CGC TTC AGT ACA TA-3'
5'-TCT TGG GCA AGC CAT C-3'
5'-ACC TGC TGC TGG AAG ATT AC-3'
5'-AAC CTG GTG GAC TTT TGC T-3'
5'-GTC CTC ATG TGT TTA TGC TGT-3'
5'-CTC AAA GTA AGA CCA TAA AAT ACC A-3'

5'-CTT TGG CTG CCC GAA A-3'
5'-CAA GGG TAT GTT CCC CAA AA-3'
5'-TGG TTT GTT TGT ATA ACT ATC ATT G-3'
5'-CCG Tree CTA TAT TTC CTG 0-3'
5'-GTC TCT GGC TGC TCT CAA GAC TAT-V
5'-TAT GGC CCA GCA ATG TGT AT-3'
5'-TTT CTC TAA GAA CTT TGG GG-3'
FIG. 7L-1



Marker     Alleles (bp)   Heterozygosity   Chromosome

SET A
D19S209    (206-272)      77%              19
D14S77     (203-251)      92%              14
D10S189    (180-188)      72%              10
D12S87     (142-168)      79%              12
D13S158    (99-113)       81%              13

SET B
D11S931    (251-267)      73%              11
D16S415    (208-234)      72%              16
D11S925    (173-199)      84%              11
D16S409    (135-147)      70%              16
D13S219    (117-127)      64%              13
D22S284    (86-102)       76%              22
D13S157    (250-264)      72%              13
D14S78     (211-233)      66%              14
D13S168    (173-197)      76%              13
D15S122    (143-159)      77%              15
D18S70     (111-126)      83%              18
FIG. 7L-2

GROUP 12



A Primer 



5'-TTC ATT CAC AM TCN ATG GW
5'-GCG TGA GTC ACT GTG CC-3'
5'-CAA AAG TAA CCA TTG AGC CCS'
5'-CAC TAG GTG ATG CTG GAC AT-3'
5'-GTA CCC ACG GAG TGA AAG AA-3'

5'-GAT TGC TTG AGC CCA G-3'
5'-CCA GTA ATG TTA TGT AAG TCA ATG C-3'
5'-AGA ACC AAG GTC GTA AGT CCT G-3'
5'-TGA ATC TTA CAT CCC ATC CC-3'
5'-AAG CAA ATA TGC AAA ATT GC-3'
5'-ATG GGT ATT TAA CTT CTC TAC ACA G-3'

5'-AGC TGA GAA ATC ACA ACA GAG A-3'
5'-GGC AGG GAT AAG TAT GTC CT-3'
5'-GCC TAG CCC AGT GGT G-3'
5'-GAT AAT CAT GCC CCC CA-3'
5'-AAG GCT CAN CTC TAC CG-3'

5'-CTG GAG AGC ATA GAC GNA GA-3'
5'-CAG ACA GAA ATT AAC CAG AGT TGA
5'-TTG ATA GAA GAA GCG ATA GAT CG-3'
5'-CTG CAC AAA CAC TTG AAA CA-3'
5'-GCT TTG ACA ATT TAG CAG CA-3'

5'-GAG AAA TAG TAT GTG TTT GCC-:
5'-TAG CCA CTG TAC CCC AGC-3'
5'-rrA CAC CAT TAT GGG GGC AA-3'
5'-AGT CAG TCT GTC CAG AGG TG-3'
5'-TCC TTC TGT TTC TTG ACT TAA CA
5'-GCT arc TTG AGG TCG m CA-3'

5'-TGG AAA TIT OCT GAC AGT AGA T-3'
5'-AAA GGT AAC ATC CAA GGG GT-3'
5'-TGC TTG TGC CTA TGT TCT TC-3'
5'-CCC AGT ATC TGG CAC GTA G-3'
5'-GGA ATG TCA AGA AGT ACC TAC CAT A-3'
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D13SIS6 

D19S226 

D16S422 

DliS65 

D16S413
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SETB 
D19S218
FIG. 7M-2

GROUP 13

A Primer 

5'-ATT AGC CCA GOT ATG GTG AC-3'
5'-CCA OCA GAT TTT GGT GTT GTC TA-3'
5'-CAG TGT AAC CTG GGG GC-3'
5'-GAG GCA GGA AAT TGC AGT GT-3'
5'-ACT CCA GCC CGA GTA A-3'

5'-AAA GCA AGG CTT CGT CTT AA-3'
5'-GCG ATC CAG CCT GTG T-3'
5'-GAA ATG TCC TAT TTG AAA CTG TGC-3'
5'-CTG GTA GTG TCA GGC ATG GC-3'
5'-ACC GTA GAC AGG ATG CCA-3'

5'-AGC TGT TCA TGC TTC CAT CT-3'
5'-TTT GCA TTT TCT GGA GTT TT-3'
5'-GCT CCA GCC TAT CAG GAT G-3'
5'-ATT GCC AGC CGT CAG TT-3'
5'-TCA CAC TCA CTG GTC TCT CA-3'
5'-GGG GCA TCT TTG GCT A-3'
FIG. 7M-3




5'-GCT GTG GTA TGA GTT ACT TAA ACA C-3'
5'-GGT CCA GGA TIT GAA CTA AAG QK-V
5'-CTr TCG ATT ACT TTA GCA GAA TGA Q-V
5'-GCT GGT err ACT ATC TCA GGG G-3'
5'-GGT CAC AGG TGG GTT C-3'

5'-TrC >fTC ATT TTA TIG TGT GCG-3'

5'-TGT AAA TGG GOT AAG TGA TGC-3'
5'-CTG TTG AAA TGT ATC CAG TAA ATC Q.y
5'-CCT ATG TTT CAG GCA AAG GC-3'
5'-TGT GGG TIT TCT CAG GTT AT-3'

5'-AGA GCC CAG AAT ATT GAC CC-3'
5'-AAT GTC CCT AAA CAC ATG GA-3'
5'-GAT TCC AGA TCA CAA AAC TGG T-3'
5'-GAC CAG CAT ATC ATT ATA GAC AAG C-3'
5'-GGT GTG CCT GTG TGT AAA AG-3'
5'-TCC GGT TTG GTT CAG G-3'
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FIG. 8 




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